HYBRID NAND WITH ALL-BL m-PAGE OPERATION SCHEME

ABSTRACT

This invention discloses a 2D or 3D NAND flash array in a two-level BL-hierarchical structure with flexible multi-page or random-page-based concurrent, mixed SLC and MLC Read, Program or Program-Verify operations, including bit-flipping for each program state, or any combination of the above operations. Tracking techniques for self-timed control and algorithms for programming, read and local bit line (LBL) voltage generation are proposed for enhancing automatic control over charging and discharging of a plurality of WLs and LBLs in one or more randomly selected Blocks in one or more Segments of one or more Groups in a NAND plane for m-page concurrent operations using Vdd/Vss to Vinh/Vss Program page data conversion, multiple pseudo CACHEs based on LBL capacitors for storing raw SLC and MSB/LSB loaded page data, and writing back to or reading from the Sense Amplifier, Program/Read Buffer, real CACHE, and multiple pseudo CACHEs, with M-fold reduction in latency and power consumption.

1. CROSS-REFERENCES TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application No. 61/920,767, filed Dec. 25, 2013, commonly assigned and incorporated by reference herein for all purposes.

This application is related to U.S. Pat. No. 8,542,535, U.S. Pat. No. 8,542,530, U.S. Pat. No. 8,334,551, U.S. Pat. No. 8,456,918, U.S. Pat. No. 8,407,400, U.S. Pat. No. 8,135,913, and U.S. Pat. No. 8,514,636, incorporated by reference herein for all purposes.

2. BACKGROUND OF THE INVENTION

The present invention is generally directed to a multiple-level, or particularly a 2-level, metal BL-hierarchical structure for a hybrid NAND array termed a HiNAND2 array.

In a conventional NAND array, the main memory organization and its associated DB (Data Buffer), SA (Sense Amplifier), CACHE Buffers, Block-decoder, and Row and Column decoders are much simpler but less flexible. As a result, there is more latency and power consumption in all operations and usages. Fundamentally, the conventional NAND array is formed into a simple matrix with a 1-level BL structure that comprises a plurality of NAND blocks cascaded and connected by a plurality of long, tight-pitch (2λ), low-resistance metal1 bit lines (BLs) in parallel in the column (y) direction, and each block is further made of a plurality of NAND strings cascaded in the row (x) direction. The inter-string gate connections in the x-direction in a same block are easily made by a plurality of horizontal rows (or pages), each using a tight-pitch (2λ) poly-gate word line (WL).

Typically, each conventional NAND String is made of M NAND cells that are connected in series with one top medium-high voltage (MHV) NMOS select transistor and one bottom MHV NMOS select transistor, where M=8, 16, 32, 64, 128 or any arbitrary integer number. The connections of all NAND Strings in each column of a NAND array are made between a long tight-pitch 2λ metal BL and each N-active drain contact of each NAND string. For a conventional NAND array with a 1-level BL scheme including an 8 KB physical page size in the x-dimension and a 64-cell physical string in the y-dimension, there are in total 8 KB metal1 BLs in one physical page, divided into 4 KB odd-number metal1 BLs for an odd-half page and 4 KB even-number metal1 BLs for an even-half page, with 64 common WLs. Every physical page includes two common string-select gate lines per one 8 KB block formed within a same Triple P-well (TPW) region within a same N-well region on top of a same P-substrate in one physical 2D NAND plane. Note, a 3D NAND may have a different number of WLs per block in a different P-substrate but has a similar 1-level 3D metal BL structure.

In a conventional NAND array, the SLC and MLC cells are physically placed in different blocks in either a same plane or different planes. So far, no SLC-page cells and MLC-page cells are physically placed within a single physical NAND block. As NAND manufacturing technology migrates toward the extremely dense 10 nm class node, coupling effects between inter-WL and inter-BL cells become severe enough to interfere with the cell threshold level Vt. The degree of the BL-BL and WL-WL coupling effects is almost the same owing to the identical 1λ physical spacing. As a result, the NAND data quality and reliability are seriously degraded and require more sophisticated ECC algorithms and techniques, applied on-chip or off-chip, to correct the rising number of error bits. The BL-BL coupling effect is not addressed in the present application. The WL-WL coupling interference gets worse when all NAND cells of adjacent WLs within the same block are utilized to store the same 4 MLC states. These WLs are referred to as MLC-WLs or MLC-pages in the present application. The reason the WL-WL interference is worse for a MLC-WL than for a SLC-WL is that a 4-Vt MLC-WL cell has two more higher positive-Vt states, a B-state and a C-state, than a 2-Vt SLC cell, which has only one lower positive-Vt A-state. The degree of the coupling effect is proportional to the relative Vt difference between any two adjacent cells physically residing either on different WLs in the same BL or on different BLs in the same WL.

Because the higher MLC Vts of the B-state and C-state store more electrons in a 2-poly floating-gate NAND cell or in a 1-poly charge-trapping Nitride NAND cell of the selected MLC-page, the WL-WL coupling interference is stronger than for a SLC cell storing the lower Vts of the E-state and A-state. As a result, the Vt levels of selected MLC cells in the E-state, A-state, and even B-state are increased more by two MLC C-state cells from the two physically adjacent, top and bottom, MLC-WLs than the Vt of SLC cells in the E-state is increased by two adjacent SLC A-state cells from the two physically adjacent SLC-WLs.

Conventionally, the NAND memory is formed entirely as either a MLC NAND or a SLC NAND in a separate chip. But due to recent NAND application needs, a hybrid NAND design containing both MLC and SLC storage types in one die is required. When it comes to a hybrid MLC and SLC NAND design, reliably maintaining both the data quality and the integrity of the SLC and MLC parts of the NAND array in one chip is a big concern. There are many ways to construct this hybrid NAND array in the prior art. In one example, each physically separate plane of a hybrid NAND array respectively acts as a big unit to store SLC-only data or MLC-only data. This can be referred to as a plane-based hybrid NAND array.

In another example, each physically separate Block of a hybrid NAND plane respectively acts as a medium unit to store SLC-only data or MLC-only data. This can be referred to as a block-based hybrid NAND array. The block that stores SLC data is referred to as a SLC-block, while the block that stores MLC data is referred to as a MLC-block.

For the above conventional SLC-block, the WL-WL coupling effect from two adjacent WLs is less severe, so the cell data quality and integrity of each SLC-block can be reliably maintained. By contrast, the cell quality of each MLC-block is greatly degraded due to the much more severe WL-WL coupling effect when the technology migrates below the 20 nm node.

As a result, a conventional hybrid NAND memory inherently provides improved quality of SLC data along with degraded MLC data. The MLC data is prone to corruption, so a more sophisticated ECC method is required to correct those MLC errors when they happen.

Additionally, the conventional NAND Erase scheme is performed in units of a physical block. In other words, the NAND cells of all physical WLs in one physical NAND block are erased collectively and concurrently. So far, a page-based Erase operation is not allowed, not to mention a random page-erase operation. This is a tremendous inconvenience when updating a file system if only a small page-sized change is required. As a result, many unnecessary big-block erases and a plurality of small page programs are disadvantageously performed, so that NAND P/E endurance cycles are greatly reduced for SLC, MLC, TLC and even XLC NAND, regardless of 2D or 3D NAND, regardless of 2-poly floating-gate NAND or 1-poly charge-trapping Nitride NAND, and regardless of PMOS NAND or NMOS NAND.

Typically, the Erase operation of a state-of-the-art 2D NAND cell is performed by using a FN-channel tunneling scheme to decrease the cell's Vt from all high programmed positive-Vt states to a low negative-Vt state, regardless of SLC, MLC, TLC, XLC and even Analog storage. As defined in a conventional NAND cell Vt distribution diagram, there is commonly only one negative-Vt state (Vte≦−0.5V), which is termed the E-state, with data being set to “1” for all storage types. Similarly, the NAND program operation also uses the same FN-channel tunneling scheme to increase the cell's Vt to multiple higher positive values. The number of positive values of the multiple programmed Vt states is determined by the NAND cell's preferred storage type. For example, a SLC-type NAND cell has only one positive programmed Vt state, which is referred to as the A-state, with a center value of Vta set to be around 1.0V. A MLC-type NAND cell has three positive programmed states, conventionally termed the A-state, B-state, and C-state, with respective Vt center values set as Vta=1.0V, Vtb=2.0V, and Vtc=3.0V. Similarly, a TLC cell has seven positive programmed states, a XLC cell has fifteen positive programmed states, and an Analog NAND cell can have up to 255 positive programmed states but with a much narrower Vt spacing between any two adjacent programmed states. Typically, the Vt spacing, ΔVt1, between the negative erased state of Vte and the next positive programmed state of Vta is much larger than the other Vt spacings, ΔVts, between any two higher positive programmed Vt states, such as ΔVt2=Vtb−Vta, ΔVt3=Vtc−Vtb, etc. A NAND design with more positive programmed states has a smaller ΔVt in a nLC NAND cell. As a result, a nLC cell is more prone to data corruption when n>1 (for MLC, TLC, XLC, etc.).
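By way of illustration only, the relation between storage type and the number of positive programmed Vt states described above can be sketched in Python. The helper names and the 8-bit figure for the Analog cell (inferred from the 255-state count) are assumptions made for this sketch; the state counts and the MLC Vt center values are taken from the text, and the E-state is placed at its upper bound of Vte = −0.5V.

```python
# Illustrative sketch only; names are hypothetical, not part of the disclosure.
# Bits per cell for each storage type. The Analog entry (8 bits) is an
# assumption inferred from the "up to 255 programmed states" figure.
BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3, "XLC": 4, "Analog": 8}


def programmed_states(bits_per_cell: int) -> int:
    """Positive programmed states: all 2**n Vt states minus the one erased E-state."""
    return 2 ** bits_per_cell - 1


# MLC Vt centers from the text; Vte is taken at its upper bound of -0.5 V.
MLC_VT_CENTERS = {"E": -0.5, "A": 1.0, "B": 2.0, "C": 3.0}


def vt_spacings(centers: dict) -> list:
    """Spacing between each pair of adjacent Vt states, lowest state first."""
    values = sorted(centers.values())
    return [hi - lo for lo, hi in zip(values, values[1:])]


if __name__ == "__main__":
    for name, bits in BITS_PER_CELL.items():
        print(name, programmed_states(bits))
    # The first spacing (E-state to A-state) is the largest one.
    print(vt_spacings(MLC_VT_CENTERS))
```

With these values the first spacing (ΔVt1, at least 1.5V) exceeds the 1.0V spacing between the higher programmed states, which matches the qualitative statement above.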

When performing a page-program operation on a selected page of a state-of-the-art NAND product, a pre-erase-before-program step is required. The first step is to follow a block-erase command to carry out a block-based Erase operation. After a long erase time of about 2-5 ms, the Vts of all NAND cells in the selected block that contains the selected page are reset to the E-state with Vte below 0V. Even if a typical NAND block comprising N physical pages or WLs (N=16, 32, 64, or 128) only needs one single page (WL) of data changed, the whole block has to be erased to allow programming of one WL or even part of one WL.

The final required erase time depends heavily on the NAND cell's storage type, the block size, and the technology node of the 2D or 3D NAND flash. Although program time, program gate voltages, and Program-Verify conditions vary with Vt and the storage types of SLC, MLC, TLC, XLC or Analog, the erase time remains pretty much identical because so far the NAND cell has only one erased state with one erased Vt=Vte. During an Erase operation, the TPW of the selected NAND plane of the NAND array has to be coupled to a HV of +20V, along with the multiple WLs in the one selected erased block coupled to 0V, to induce the desired FN-channel tunneling effect that removes the stored electrons from the floating-gate of 2-poly NAND cells or from the Nitride charge-trapping layer in 1-poly SONOS or MONOS NAND cells.

The cells in the unselected WLs and in the unselected blocks are initially left in a floating state, so that when the voltage of the selected TPW subsequently rises to 20V for the Erase operation, the voltage of the unselected floating WLs is also coupled up to near 20V and the voltage drop between the unselected WLs and the common TPW is small. As a result, no FN-tunneling effect is induced. Thus, the Vts of the NAND cells in the unselected blocks in both selected and unselected NAND planes remain unchanged with a negligible ΔVt.

Since the NAND spec performs Erase in units of a Block but Program in units of a single page, there are many disadvantages in the above-mentioned Erase-before-Program step, as listed below.

1) Erase-before-Program step reduces NAND's limited P/E endurance cycles:

    Any one-page or partial-page Program operation of NAND requires a Block-erase to reset the Vts of all NAND cells in all pages of one selected Block to Vte. Even if there is only one page to be programmed with new data, the remaining M-1 WLs in the selected erased Block need to be reprogrammed with the old data. Therefore, for any single-page data change in a selected Block, the cells in each of the M-1 pages suffer one unnecessary 20V TPW erase stress and M-1 page-program 20V WL voltage stresses, so NAND's P/E endurance cycles are degraded and reduced by each new data change.

2) Erase-before-Program step drastically shortens NAND's life cycle:

    In the state-of-the-art 2D or 3D NAND spec, the Block-erase time is set to be ˜2.5 ms per block, but the SLC program time spec is set to be 250 μs. Thus each Block-erase time is about 10-fold of each page-program time, and the lengthy Erase-before-Program operation, rather than the page-program operation, is the speed bottleneck in changing NAND data.

3) Block-erase step results in more power consumption than unnecessary page-program:

    In the pre-erase step, a 20V HV has to be coupled to the selected TPW in the selected NAND plane. Typically, a whole big NAND array is only divided into 2, 4 or 8 planes. Thus even if only one NAND plane is selected for Erase, the 20V HV requirement in the selected NAND plane still consumes huge power because the area of the selected TPW is still a big parasitic capacitor that needs to be charged up to 20V by an Erase pump circuit. Although the Block-erase time seems to be the same as that of a page-erase, the unnecessary extra program time and power consumption for the M-1 pages slow down and degrade overall operation.
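By way of illustration only, the overhead of the Erase-before-Program step for a single-page change can be estimated with the short Python sketch below. The function and field names are hypothetical; the timing figures are the approximate block-erase time (˜2.5 ms) and SLC page-program time (about 250 μs) quoted in this section, and the time needed to read out the old pages before the erase is ignored.

```python
# Illustrative sketch only; function and field names are hypothetical.
BLOCK_ERASE_US = 2500.0   # ~2.5 ms block-erase time quoted in the text
SLC_PROGRAM_US = 250.0    # SLC page-program time quoted in the text


def single_page_update_cost(pages_per_block: int) -> dict:
    """Cost of changing one page when the whole block must be erased first.

    The time to read out the old pages before the erase is ignored here.
    """
    if pages_per_block < 1:
        raise ValueError("a block needs at least one page")
    rewritten_pages = pages_per_block - 1  # old data programmed back (M-1 pages)
    # One block erase, then every page of the block is programmed again.
    total_us = BLOCK_ERASE_US + pages_per_block * SLC_PROGRAM_US
    return {
        "rewritten_pages": rewritten_pages,
        "total_us": total_us,
        "overhead_vs_one_program": total_us / SLC_PROGRAM_US,
    }


if __name__ == "__main__":
    print(single_page_update_cost(64))
```

Under these assumptions, a one-page change in a 64-page SLC block costs one erase plus 64 page programs, of which 63 only restore old data.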

Since it is impractical to physically separate the common TPW between any adjacent WLs or BLs without a big penalty in NAND array layout area, the conventional NAND spec only allows the Erase operation to be conducted in a block-based manner, rather than in a page-based manner like the page-program operation.

Conversely, the NAND Program operation only applies Vpgm of 20V to a selected WL and Vpass (of 10V) to the remaining M-1 WLs of the selected Block, with the TPW coupled to Vss.

Thus in the NAND Program operation, no big TPW disturbance exists at the unselected pages (WLs) and Blocks. Thereby, the NAND Program is performed in a page-based manner with a page-program time of approximately 250 μs for a SLC cell, as only one programmed A-state is required. For a MLC cell, a 3-fold program time of 750 μs is required, as three programmed states, the A-state, B-state, and C-state, are required. Further, the page-program time increases proportionally even more when programming a TLC-type cell, a XLC-type cell or an Analog cell. In principle, Erase operations in prior-art NAND can be randomly and independently executed on a single-page basis. But in reality, due to the above-mentioned program sequence drawback, the typical NAND Erase spec only allows block-based Erase; no random page-based Erase is allowed.

Moreover, even though the spec allows the Program operation to be performed in units of a page, the sequence of page-program WLs in the selected Block has a stringent restriction. The sequence of NAND's page-program in the erased Block has to start from the first WL near the bottom of the NAND Strings and proceed to the last WL near the top of the NAND Strings. Usually, the Block comprises 64 WLs, although other numbers such as 32 or 128 physical pages or WLs are also used. The page-program sequence starts from WL0, then WL1, and continues to the last WL63 in a 64-cell NAND string. A random WL program is prohibited in the conventional NAND array, regardless of SLC or MLC and 2D or 3D type of cells.

One conspicuous reason that the conventional NAND is prohibited from providing a random page program scheme is the limitation of all self-boost (SB) Program-Inhibit schemes, regardless of SB, ESB and LSB techniques. A successful random page program operation requires a SB program-inhibit scheme to boost the channel's initial inhibit voltage of Vdd or Vdd-Vt to more than 7.0V when a selected WL is coupled to a rising Vpgm=20V and the remaining N-1 unselected WLs in the selected block are coupled to a rising Vpass=10V. In a random-page program scenario, the unselected N-1 NAND cells in WLn+1 or WLn-1, above or below the selected WLn, may be in the programmed states with Vta, Vtb, and Vtc, which are higher than the Vte (negative Vt) state. The Vpass coupling from the gates of non-selected WLs to the channels of non-selected cells would then be drastically reduced to 3V. As a consequence, the selected WL coupling of 20V would not be strong enough to boost the channel of the selected cell above 7V due to its leakage to the adjacent under-boosted channels of unselected cells in the same NAND string. Thus the program operation may fail, and the situation becomes worse if the cells in the same selected WL but in the two adjacent BLs are in the program state with channel voltage at Vss, because the boosted WL voltage of 20V then has to boost the channel against the coupling of the BL parasitic capacitance.

But if the program sequence starts from the NAND String bottom, the unprogrammed NAND cells from WLn+1 to the top WL are still in the Vte state, which maintains a high Vpass coupling ratio to assist Vpgm boosting of the channel of the selected cell, so that a higher success rate in the program-inhibit operation is obtained.

For the reasons stated above and due to strong market demand, it is desirable to provide advanced NAND products supporting random-page operation along with a page-based Erase operation to avoid unnecessary Erase and Program operations on unselected pages or WLs. It is desirable to have a new hybrid NAND memory technique to improve MLC data quality. Further, it is desired to achieve a faster write operation with higher P/E endurance cycles and superior NAND data integrity and reliability for a low-current NAND, regardless of 2D or 3D NAND, 1-poly charge-trapping Nitride SONOS or MONOS NAND or 2-poly floating-gate type NAND, and PMOS or NMOS NAND.

3. BRIEF SUMMARY OF THE INVENTION

The present invention is related to techniques of NAND operations. Particularly, embodiments of a hybrid SLC/MLC NAND scheme can provide not only all conventional operations such as page-based Read, page-based Program, page-based Program-Verify, and Block-based Erase and Erase-Verify, but also novel concurrent multi-page Read, multi-page Program, multi-page Program-Verify, and multi-random-page Erase and Erase-Verify on either consolidated or dispersed Blocks, as well as mixed page-based and multi-page SLC and MLC operations.

In certain embodiments, the present invention discloses multi-page (m-page), page-based, or mixed SLC and MLC NAND operations that have many advantages over prior-art NAND. These preferred NAND operations include concurrent/pipeline Program, Program-Verify, Erase-Verify and Read, BL-precharge, BL-discharge, WL-precharge, and WL-discharge in all NAND planes as long as no contention occurs on the BLs, WLs, and corresponding XT, GSLp, and SSLp bus lines. The XT, GSLp, and SSLp bus lines are the commonly shared row lines for all WLs and the two string-select control lines, SSL and GSL, per block. In other words, the subsequently disclosed techniques of the present invention allow the following random SLC, MLC MSB, and MLC LSB operations in all NAND planes simultaneously and independently:

    a) Random page-based and m-page Erase in each independent Segment,

    b) Random page-based and m-page SLC or MLC Program in each independent Segment,

    c) Random page-based and m-page SLC and MLC Program-Verify in each independent Segment,

    d) Random page-based and m-page Erase-Verify in each independent Segment,

    e) Random page-based and m-page SLC and MLC Read in each independent Segment,

    f) Self-timed concurrent WL HV voltage and operating time controls. The automatic WL HV voltages include Vpgm, Vpass, and V_(READ) for various operations. The HV time controls include detection, precharge and discharge of WLs at specific voltages for varied operations such as Program, Program-Verify, Erase-Verify and Read for NAND.

Note, in the present invention, one option is that a full physical WL page can be divided into an N-bit odd ½-page (or simply called odd page) and an N-bit even ½-page (or simply called even page) to respectively accommodate ½-page, N-bit, top-level metal2 GBLs but 1-page, 2N-bit, lower-level metal1 LBLs in a HiNAND2 array. In another option, a full physical WL page can be divided into a first ¼-page, second ¼-page, third ¼-page, and fourth ¼-page to accommodate ¼-page, N-bit, metal2 GBL lines but full-page, 4N-bit, metal1 LBL lines in a HiNAND2 array. The bit number of the Data Buffer (DB) is preferably kept the same as the number of GBL lines to save DB area and to have an easier layout between the DB and the HiNAND2 array in the BL areas.
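By way of illustration only, the sharing of one metal2 GBL by two (½-page option) or four (¼-page option) metal1 LBLs can be sketched in Python as below. The function names and the assumption that neighbouring LBL columns are interleaved across sub-pages (column k belongs to sub-page k mod fold) are hypothetical choices made for this sketch; only the 2N-to-N and 4N-to-N ratios come from the text.

```python
# Illustrative sketch only; names and the interleaved column order are assumptions.

def lbl_to_gbl(lbl_index: int, fold: int) -> tuple:
    """Map a metal1 LBL column to (metal2 GBL index, sub-page index).

    fold is 2 for the 1/2-page option and 4 for the 1/4-page option.
    """
    if fold not in (2, 4):
        raise ValueError("fold must be 2 (half page) or 4 (quarter page)")
    if lbl_index < 0:
        raise ValueError("lbl_index must be non-negative")
    return lbl_index // fold, lbl_index % fold


def gbl_count(lbl_count: int, fold: int) -> int:
    """Number of GBLs (and Data Buffer bits) needed for a given number of LBLs."""
    if lbl_count % fold != 0:
        raise ValueError("lbl_count must be a multiple of fold")
    return lbl_count // fold


if __name__ == "__main__":
    # Eight LBL columns in the half-page option share four GBLs.
    print([lbl_to_gbl(k, 2) for k in range(8)], gbl_count(8, 2))
```

The Data Buffer width then follows the GBL count rather than the LBL count, which is the area saving noted above.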

Before undertaking the detailed description below, it is advantageous to set forth definitions of certain words and phrases used throughout this patent application: the terms “include” and “comprise,” as well as derivatives thereof, mean inclusion without limitation; the term “or” is inclusive, meaning and/or; the phrase “associated with,” as well as derivatives thereof, may mean to include, be included within, interconnect with, contain, be contained within, cooperate with, connect to or with, couple to or with, be communicable with, cooperate with, interleave, juxtapose, be proximate to, be bound to or with, have, have a property of, or the like; and the terms “2D” or “3D” NAND flash mean any part or whole of either an NMOS or PMOS NAND cell device, component, die or chip made with any kind of two-dimensional or three-dimensional manufacturing process, technology, design or the like, associated with a 2-poly floating-gate or 1-poly charge-trapping Nitride layer 3D cell structure; any Vertical-gate or Vertical-channel 3D NAND cell structure or the like, or derivatives thereof, are included.

Note, the term “Block” means a “Physical-Block” or “Consolidated Block” in prior-art NAND. In this application, a “Block-based operation” refers to a “Dispersed Block” or “Logic-Block” which contains multiple selected physical pages or WLs, where each page or WL is from a random “Physical-Block” of one random Segment of one random Group in a NAND plane. For example, under the conventional Physical-Block definition, a NAND string has M cells connected in series, and each Physical-Block contains a plurality of M-WL Strings, where M=8, 16, 32, 64, 128 or any integer number. During a conventional block-erase, all NAND cells in all M selected pages in a single selected Physical-Block are erased simultaneously. If M=64, then erasing any number of WLs ≦63 is not allowed, because the data of unselected NAND cells in the non-selected pages of the same Physical-Block would be unintentionally lost or corrupted due to the severe coupling effect of the common P-well being coupled to an HV erase voltage of 20V. This is called a “Consolidated Block” or “Physical-Block” in the prior-art NAND array, together with its associated erase set-up conditions.

Conversely, an m-page “Block-based” operation of the present invention means an operation on one “Logic-Block” that comprises “multiple physical WLs or multiple physical pages” dispersed randomly in multiple Physical-Blocks in multiple physically dispersed Segments of multiple physically dispersed Groups in one NAND plane. Each of the “multiple”, denoted as m, physical WLs or pages is preferably selected from one Physical-Block within one selected Segment to use one Segment C_(LBL) capacitor at a time during a preferred m-page operation. The m-page operation means that a “plurality” of, or “many”, WLs in m Blocks are selected for concurrent operations such as Read, Program, Program-Verify, Erase-Verify, full-page (full WL) or partial-page (Odd/Even halved, partial interleaved or non-interleaved) C_(LBL) capacitor precharging and discharging, and WL precharging and discharging operations. “Concurrent” means more than one WL or page is selected for performing the same or different key NAND operations at the same time. “Concurrent” does not mean the whole operating duration, interval or period, but means that at least in some overlapping time intervals more than one operation is performed in different Segments in the same or different Groups of the same HiNAND plane, so that the NAND idle time can be dramatically reduced and the NAND storage usage in the system can be maximized.

Note, although a “Dispersed Block” means “multiple WLs from more than one different Physical-Block”, only one selected WL in one selected Block per selected Segment is allowed. But one or more Segments can be selected simultaneously in one Group in a NAND plane formed under the HiNAND2 array structure. Note, one selected Block within a Segment of a Group of a plane means one physical NAND Block, rather than multiple. The total number m of selected WLs in the Dispersed Block can be very flexible. A larger m is preferred to achieve more than m-fold reduction in power consumption, latency and operation time for m-page Read, Verify and Program operations. The final choice of m is a tradeoff between performance, power consumption, latency, silicon area, and the extent of circuit complexity implemented by the on-chip Controller design.
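By way of illustration only, the selection rule for a Dispersed Block stated above (at most one selected WL, in one selected Block, per selected Segment) can be checked with the Python sketch below. The `PageAddr` type, its field names and the function names are hypothetical and are introduced only for this sketch.

```python
# Illustrative sketch only; PageAddr and the function names are hypothetical.
from collections import Counter
from typing import List, NamedTuple, Tuple


class PageAddr(NamedTuple):
    group: int     # Group index within the NAND plane
    segment: int   # Segment index within the Group
    block: int     # Physical-Block index within the Segment
    wl: int        # WL (page) index within the Physical-Block


def segment_conflicts(selection: List[PageAddr]) -> List[Tuple[int, int]]:
    """Return the (group, segment) pairs that hold more than one selected WL."""
    counts = Counter((page.group, page.segment) for page in selection)
    return sorted(key for key, n in counts.items() if n > 1)


def is_valid_dispersed_block(selection: List[PageAddr]) -> bool:
    """True if the m selected pages use at most one WL per Segment."""
    return len(selection) > 0 and not segment_conflicts(selection)


if __name__ == "__main__":
    pages = [PageAddr(0, 0, 3, 17), PageAddr(0, 1, 5, 2), PageAddr(1, 0, 0, 63)]
    print(is_valid_dispersed_block(pages))                         # three Segments, m = 3
    print(segment_conflicts(pages + [PageAddr(0, 1, 7, 9)]))       # Segment (0, 1) used twice
```

A controller could run such a check before launching an m-page operation, so that two selected WLs never compete for the same Segment C_(LBL) capacitor.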

Some terminologies defined by the present invention are summarized and explained below (referring to the HiNAND2 array of FIG. 1A).

1) Data Buffer (DB): It is used to store single N-bit page data. It is comprised of three circuits with the same bit length, namely a 1-bit Multiplier, a 1-bit Latch-type SA and a 1-bit Program/Read Buffer (P/RB) per bit.

2) P/RB: N-bit Program/Read Buffer.

    a) “0” page bit data is to pass Vss to the channel of a program cell through each corresponding metal2 GBL and metal1 LBL.

    b) “1” page bit data is to pass Vdd to the channel of a program-inhibit cell through each corresponding metal2 GBL and metal1 LBL.

3) N-bit real CACHE Register: It is made of a glue logic circuit to store one inputted N-bit page data for a preferred m-page Program operation.

4) CACHEcel: This is a first 2N-bit pseudo CACHE Register using 2N metal1 LBL capacitors (C_(LBL)) to temporarily store the current new page data.

5) CACHEint: This is a second 2N-bit pseudo CACHE Register, paired with CACHEcel, using 2N metal1 LBL C_(LBL) capacitors to temporarily store the last odd-number N-bit and even-number N-bit transient page data.

6) CACHEmsb: This is a third 2N-bit pseudo CACHE Register using 2N metal1 LBL C_(LBL) capacitors to temporarily store the last odd-number N-bit and even-number N-bit MLC MSB page data.

7) CACHElsb: This is a fourth 2N-bit pseudo CACHE Register, paired with CACHEmsb, using 2N metal1 LBL C_(LBL) capacitors to temporarily store the last odd-number N-bit and even-number N-bit MLC LSB page data.
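By way of illustration only, the P/RB behaviour of item 2 (a “0” bit passes Vss for a program cell, a “1” bit passes Vdd for a program-inhibit cell) and the Vdd/Vss to Vinh/Vss page data conversion named in the Abstract can be sketched symbolically in Python. The function names are hypothetical, and voltage levels are kept as labels because the text gives no numeric Vinh value here.

```python
# Illustrative sketch only; function names are hypothetical and the voltage
# levels are symbolic labels, not numeric values.
from typing import List


def prb_levels(page_bits: str) -> List[str]:
    """Level passed by the P/RB per bit: '0' -> Vss (program), '1' -> Vdd (inhibit)."""
    mapping = {"0": "Vss", "1": "Vdd"}
    try:
        return [mapping[bit] for bit in page_bits]
    except KeyError as exc:
        raise ValueError("page data must contain only '0' and '1'") from exc


def to_inhibit_pattern(levels: List[str]) -> List[str]:
    """Convert a Vdd/Vss pattern to a Vinh/Vss pattern; Vss bits stay at Vss."""
    return ["Vinh" if level == "Vdd" else level for level in levels]


if __name__ == "__main__":
    loaded = prb_levels("0110")
    print(loaded)
    print(to_inhibit_pattern(loaded))
```

The sketch only models which bits end up at the program level and which at the inhibit level; how the LBL capacitors are actually precharged is described elsewhere in the specification.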

Note, each name of the above four pseudo CACHE registers is not given to a fixed set of 2N metal1 capacitors but is used in a rotational manner. In other words, the name CACHEcel is used for whichever CACHE contains the selected WL. For example, if the first CACHE has the selected Block that contains the selected WL, then this CACHE is termed CACHEcel. The second CACHE, which is connected to CACHEcel, is preferably termed CACHEint when a TIE transistor, MLBLb, is used to connect them. But subsequently, when a selected WL comes from the second CACHE, CACHEint, the name of the second CACHE is changed to CACHEcel and the paired first CACHE is renamed CACHEint.

Similarly, the third CACHEmsb and the fourth CACHElsb work as paired CACHEs joined by one TIE transistor, MLBLb, like the first CACHEcel and the second CACHEint above. If one WL in one Block in the third CACHEmsb is selected, then the name of the third CACHEmsb is changed to the first CACHEcel and the paired fourth CACHElsb becomes the second CACHEint. Accordingly, the first CACHEcel is changed to the third CACHEmsb and the second CACHEint is changed to the fourth CACHElsb, following the rotational naming of the four selected CACHEs.
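By way of illustration only, the rotational naming of the four pseudo CACHEs can be sketched in Python as below. The physical indices 0 to 3, the `TIE_PAIRS` table and the function name are hypothetical. The case where the selected WL lies in the second CACHE of a pair follows the renaming described above for CACHEint; the naming of the other pair in that case is an assumption of this sketch.

```python
# Illustrative sketch only; indices, TIE_PAIRS and the function name are hypothetical.
# Physical pseudo CACHEs 0..3; each tuple is a pair joined by one TIE transistor.
TIE_PAIRS = [(0, 1), (2, 3)]


def assign_cache_names(selected_cache: int) -> dict:
    """Name the four pseudo CACHEs given the one that holds the selected WL."""
    for pair in TIE_PAIRS:
        if selected_cache in pair:
            partner = pair[1] if selected_cache == pair[0] else pair[0]
            other = next(p for p in TIE_PAIRS if p != pair)
            return {
                selected_cache: "CACHEcel",  # holds the selected WL
                partner: "CACHEint",         # tied to CACHEcel
                other[0]: "CACHEmsb",        # the other tied pair keeps
                other[1]: "CACHElsb",        # the MSB/LSB roles
            }
    raise ValueError("selected_cache must be one of 0, 1, 2, 3")


if __name__ == "__main__":
    for index in range(4):
        print(index, assign_cache_names(index))
```

The names therefore describe roles for the current operation rather than fixed capacitors, so a controller would recompute the assignment each time the selected WL moves to a different pseudo CACHE.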

In an embodiment, the present invention provides a hybrid HiNAND2 array configured to mix both MLC-WLs and SLC-WLs in an interleaved manner in one physical Block so that the severe MLC WL-WL interference coupling effect can be greatly reduced. Each SLC-WL is purposely inserted between two adjacent top and bottom MLC-WLs, and vice versa: each MLC-WL is inserted between a physically adjacent top SLC-WL and bottom SLC-WL. In this configuration, the Vts of the MLC NAND cells of a selected MLC-WL do not suffer the WL-WL coupling effect from the MLC NAND cells of two physically adjacent MLC-WLs. Each SLC-WL is used as a WL-buffer between two physically adjacent MLC-WLs.

More specifically, the present invention discloses a hybrid SLC and MLC HiNAND2 array formed preferably with physically interleaved SLC-WLs and MLC-WLs in every NAND Block. For example, if a SLC-WL is formed in every odd WL, then a MLC-WL is preferably formed in every even WL. For a typical 64-cell hybrid HiNAND2 String, the preferred interleaved WL array is organized with 32 odd-number SLC-WLs such as SLC-WL1, SLC-WL3, . . . , SLC-WL61, and SLC-WL63 and 32 even-number MLC-WLs such as MLC-WL2, MLC-WL4, . . . , MLC-WL62, and MLC-WL64, or vice versa. The top and bottom string-select transistors are kept the same as in the prior art.
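By way of illustration only, the interleaved SLC-WL/MLC-WL organization of a 64-cell String can be generated and checked with the Python sketch below. The function names are hypothetical. The logical page count assumes one logic page per SLC-WL and two logic pages (MSB and LSB) per MLC-WL, as the specification states for MLC physical pages.

```python
# Illustrative sketch only; function names are hypothetical.

def interleaved_wl_types(num_wls: int = 64, slc_on_odd: bool = True) -> dict:
    """Map WL number (1-based, as in SLC-WL1 ... MLC-WL64) to 'SLC' or 'MLC'."""
    layout = {}
    for wl in range(1, num_wls + 1):
        is_odd = wl % 2 == 1
        layout[wl] = "SLC" if is_odd == slc_on_odd else "MLC"
    return layout


def has_adjacent_mlc(layout: dict) -> bool:
    """True if any two physically adjacent WLs are both MLC-WLs."""
    return any(
        kind == "MLC" and layout.get(wl + 1) == "MLC" for wl, kind in layout.items()
    )


def logical_pages(layout: dict) -> int:
    """One logic page per SLC-WL, two (MSB and LSB) per MLC-WL."""
    return sum(2 if kind == "MLC" else 1 for kind in layout.values())


if __name__ == "__main__":
    layout = interleaved_wl_types()
    print(layout[1], layout[2], layout[63], layout[64])
    print(has_adjacent_mlc(layout), logical_pages(layout))
```

Passing `slc_on_odd=False` gives the "vice versa" arrangement, with MLC-WLs on the odd positions and SLC-WLs on the even positions.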

In addition, the present invention provides further advantages because the circuits of the Y-pass of BLs, Drivers of WL voltages, X/Y Decoders, SAs and P/RBs, GBLs and LBLs can be shared in the same physical NAND plane. Thus, more flexibility of NAND operations and area reduction can be achieved. As a result, the advantages of mixed SLC and MLC in NAND can be fully attained without any sacrifice of array area or of NAND data quality and reliability.

Besides the disclosure of the above novel 2-level BL-hierarchical architecture with interleaved SLC-WLs and MLC-WLs in one physical Block, the present invention also discloses many preferred, advantageous, flexible m-page-based NAND operations to replace the conventional slow single-page-based NAND operations, including Program, Read, and Program-Verify, which carry the disadvantageous restriction that the page-Program sequence must be performed from NAND String bottom to NAND String top without flexibility.

Each 2N-bit MLC physical page data is divided into 2N-bit MSB logic page data and 2N-bit LSB logic page data, while SLC has only one 2N-bit page data. Therefore, for a hybrid HiNAND2 design with interleaved MLC and SLC WLs, two 2N-bit CACHEcel capacitors are required to store both the 2N-bit MSB logic page data and the 2N-bit LSB logic page data per physical WL per Segment, but only one 2N-bit CACHEcel capacitor is required for SLC page data per Segment.

Note, in the present application, the terms 2N-bit or 8 KB for one full page of one full physical WL, and N-bit or 4 KB for one ½-page or one ½ physical WL page, are used alternately in many embodiments shown in the following sections of the specification. An 8 KB WL means 8 KB NAND cells are connected to one physical WL, without including any extra ECC parity bytes, for simplicity of description. Moreover, each physical WL can have a smaller number of NAND cells such as 4 KB, 2 KB, 1 KB and even 512K. In application, whenever the NAND cell number of one physical WL is decreased, the bit numbers of the metal2 GBLs and the DB (Data Buffer) are scaled down as well. As a result, the present invention makes it possible to keep using relaxed layout design rules for the metal2 GBLs and to ease the GBL-interface pitch issue between the NAND array and the DB circuit.

The present invention also provides several preferred HiNAND2 hierarchical arrays that comprise a plurality of pseudo CACHE registers made of a plurality of bottom-level metal1 LBLs and top-level metal2 GBLs, and the associated peripheral circuits such as DBs that comprise a plurality of Multipliers, SAs, P/RBs, a Program-Verify detecting circuit, a real CACHE, Block-decoders, Segment-decoders, Dummy WL voltage detectors for charging to Vpgm, Vpass, and Vread and for discharging to Vss, a BL LBLps charging and discharging detector, and a Power-down detector for latching a plurality of BL program page data.

The varied CACHE metal1 C_(LBL) and metal2 C_(GBL) capacitors, together with the novel techniques of LD/LT (Loading and Latching) page data, precharging the Vinh Program-Inhibit voltage, and converting the Vdd/Vss data pattern to Vinh/Vss, provide the most flexible and reliable Read, Program, and Erase quality, with a potential multiple-fold reduction in latency and power consumption for the above preferred M-page concurrent operations. Note, the disclosed circuits, flows, and methodologies can be applied to all kinds of NAND flash memories, regardless of 2D or 3D NAND, regardless of 2-poly floating-gate or 1-poly charge-trapping Nitride NAND flash, and regardless of PMOS or NMOS NAND flash technology.

According to certain embodiments of the present invention, the preferred low-current FN-channel tunneling scheme is used for both Program and Erase operations. In contrast to conventional NAND, both Program and Erase operations of the present invention are preferably performed in units of both random pages and random Blocks, with the only restriction being one selected page per Segment. There is no restriction on the Program sequence, unlike the conventional program operation, which has to start from the bottom of a NAND String and proceed to the top with very limited flexibility. All Read, Program, and Erase operations can be performed on one or more random physical pages in one or more physical Groups, with one selected page per Segment, to avoid undesired contention on the plurality of metal1 LBLs and metal2 GBLs and on the 64 XTs, 1 SSLp, and 1 GSLp bus lines.
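The one-selected-page-per-Segment restriction can be illustrated with a minimal sketch. The tuple layout (group, segment, block, wl) and the function name are hypothetical and chosen only for illustration; they are not taken from the disclosed circuits.

```python
from collections import Counter

def segments_with_contention(selected_pages):
    """Return the (group, segment) pairs that hold more than one selected page.

    Each entry of selected_pages is a hypothetical (group, segment, block, wl)
    tuple. An empty result means the selection obeys the restriction of one
    selected page per Segment.
    """
    counts = Counter((group, segment) for group, segment, _block, _wl in selected_pages)
    return {key for key, n in counts.items() if n > 1}

# Three random pages, each in a different Segment: allowed.
ok_selection = [(1, 1, 3, 31), (1, 2, 0, 5), (2, 1, 7, 63)]
# Adding a second page in Group 1, Segment 2 would share the same LBLs.
bad_selection = ok_selection + [(1, 2, 4, 10)]

print(segments_with_contention(ok_selection))
print(segments_with_contention(bad_selection))
```

In this sketch the Block and WL addresses are unconstrained, which mirrors the statement that pages and Blocks may be selected randomly.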

According to some aspects of these embodiments of the present invention, for the preferred HiNAND2 m-page Program operations, an Erase operation is required before every random-page Program. In the following summarized inventive objectives of the present invention, reference is made to the accompanying drawings, flows, and tables that form a part hereof, and in which are shown, by way of illustration, specific embodiments in which the disclosure may be practiced. In the drawings, like numerals describe substantially similar components throughout the several views. These embodiments are described in sufficient detail to enable those skilled in the regular NAND art to practice the embodiments and to capture the foundations of the following claimed objectives. Other embodiments may be utilized, and structural, logical, and electrical changes may be made, without departing from the scope and objectives of the present disclosure. The following detailed objectives and descriptions are, therefore, not to be taken in a limiting sense.

In a specific embodiment, the present invention provides a first option of a HiNAND2 array structure, as shown in FIG. 1A, that is comprised of a plurality of rows and columns with a 2-level metal BL-hierarchical structure. The top-level BL is preferably comprised of J broken 4λ-pitch metal2 GBLs per two HiNAND2 columns, where J is an arbitrary integer greater than one, e.g., J≧2. The whole HiNAND2 array in the y-direction is divided into J Groups, such as Group 1 to Group J, by using J-1 GBL-divided devices, each of which is an MHV (˜10V) NMOS transistor, MGBL, such as MGBL₁ to MGBL_(J-1). Each Group size can be flexibly made to be the same or different. The bottom-level BL is comprised of L broken 2λ-pitch metal1 LBLs per one HiNAND2 2λ-pitch column, where L is an arbitrary integer greater than one, e.g., L≧2. Each Group is preferably further divided into L/2 pairs of Segments. Each pair of Segments comprises two Segments arranged in the BL direction and physically tied by one row of LBL-divided transistors, MLBLb, which can connect or disconnect the two Segments electrically.

Each Segment further includes K Blocks that are connected by one bottom-level metal1 2λ-pitch LBL. Each Block is comprised of 2N NAND Strings cascaded in the x-direction (row), and each String is comprised of M NAND cells connected in series with a first BL-select transistor MS and a second SL-select transistor MG, where M=64 in some examples explained within this application, though other integers are possible. In this HiNAND2 array with N top-level metal2 4λ-pitch GBLs and 2N bottom-level metal1 2λ-pitch LBLs, the preferred minimum DB size is 4 KB with N bits to accommodate the 4 KB N top-level metal2 GBLs.

In another specific embodiment, the present invention provides a second option of a HiNAND2 array structure that is comprised of a plurality of rows and columns with the same total memory bits and a similar 2-level metal BL-hierarchical structure with top-level N, 2 KB, 8λ broken metal2 GBLs but keeps the same bottom-level 4N, 8 KB, 2λ metal1 LBLs. The whole HiNAND2 array in the y-direction (column) is divided into J Groups, such as Group 1 to Group J, by using J-1 GBL-divided devices, each of which is an MHV (˜10V) NMOS transistor, MGBL, such as MGBL₁ through MGBL_(J-1). Each Group size can be flexibly made to be the same or different. Each bottom-level BL is comprised of L metal1 2λ-pitch broken-LBLs per one HiNAND2 column, where L is an arbitrary integer greater than one, e.g., L≧2. Each Group is preferably further divided into L Segments. Between one pair of Segments in the bit line direction there is one row of 2λ-pitch metal1 LBL-divided devices, which are MHV (˜10V) NMOS transistors MLBLb for connecting or disconnecting the pair of Segments. Each Segment is further made of K Blocks that are connected by 4N broken metal1 2λ-pitch LBLs. Each Block is comprised of 4N NAND Strings cascaded in the x-direction, and each String is comprised of M NAND cells in series with one BL-select transistor MS and one SL-select transistor MG, where M=64 in this example.

Note, the bit number of the DB and real CACHE is further reduced from 4N to N, i.e., by a factor of ½^(t) where t=2. This is a tradeoff between area saving and the page loading cycles of the DB and real CACHE. In this HiNAND2 array with N metal2 8λ-pitch GBLs and 4N metal1 2λ-pitch LBLs, the required minimum DB size is N bits to accommodate the N metal2 GBLs. Thus the DB size is further reduced.

In yet another specific embodiment, the present invention provides a 2-Vt SLC and 4-Vt MLC hybrid HiNAND2 array structure that is preferably organized to have one SLC-WL on each odd (or even) WL and, alternately, one MLC-WL on each even (or odd) WL. In other words, every SLC-WL is surrounded by two adjacent MLC-WLs and, vice versa, every MLC-WL is surrounded by two adjacent SLC-WLs, to avoid the worst-case coupling between two MLC WLs.
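The interleaving rule can be sketched as follows, assuming WLs are numbered from 1 to M and that parity alone decides the cell type. The function names and the default of placing SLC on odd WLs are illustrative assumptions, not part of the disclosure.

```python
def wl_cell_types(num_wls=64, slc_on_odd=True):
    """Assign "SLC" or "MLC" to each WL number 1..num_wls by parity (assumed rule)."""
    types = {}
    for wl in range(1, num_wls + 1):
        is_odd = wl % 2 == 1
        types[wl] = "SLC" if is_odd == slc_on_odd else "MLC"
    return types

def has_adjacent_mlc(types):
    """True if any two neighbouring WLs are both MLC (the coupling case to avoid)."""
    wls = sorted(types)
    return any(types[a] == "MLC" and types[b] == "MLC" for a, b in zip(wls, wls[1:]))

layout = wl_cell_types()
print(layout[1], layout[2], layout[3])
print(has_adjacent_mlc(layout))
```

With either parity choice, no two MLC-WLs are neighbours, which is the property the interleaved organization is meant to guarantee.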

In still another specific embodiment, the present invention discloses a preferred set of bias conditions for an arbitrary number, up to M, of WLs per Block selected from an arbitrary number of Blocks in an arbitrary number of Segments in one or more Groups and Planes for the Erase operation, including setting the selected WLs to 0V; setting the selected TPW in the selected plane to Verase (20V in this example); and then subsequently floating the unselected WLs, SSLs, and GSLs in any selected and unselected Blocks, Segments, Groups, and Planes.

In yet still another specific embodiment, the present invention discloses a preferred set of bias conditions for m WLs selected on a one-WL-per-Block, one-Block-per-Segment basis from m selected Blocks of m selected Segments in one or more Groups and Planes for an Erase-Verify operation, including setting all selected m WLs to 0V; setting the unselected (M−1) WLs in each selected Block to Vread (˜6V); setting the selected SSLs and GSLs to Vdd or Vread; setting the unselected WLs, SSLs, and GSLs of unselected Blocks to 0V; and setting the TPW in the selected plane to 0V. The pass or fail condition of the Erase-Verify operation is determined by the voltages of all selected m×2N C_(LBL) capacitors. If, for any one of the m pages, the precharged Vinh voltage of all 2N C_(LBL) capacitors changes to Vss, then the Erase-Verify for the corresponding page passes; otherwise, it fails. The passed pages should then be prohibited from being further erased. The iterative Erase operation should be continued on the failed pages.

In an alternative embodiment, the present invention provides preferred LBL Program-Inhibit/Program voltages of Vinh/Vss, replacing the conventional Vdd/Vss, for m selected 8 KB page data to achieve superior m-page Program-Inhibit and Program conditions in accordance with m-page MLC or SLC Program page patterns. This is referred to as a super self-boosting Program-Inhibit (SSBPI) scheme, with a much higher initial precharged LBL voltage of Vinh than the Vdd or Vdd-Vt used in the prior art.

In another alternative embodiment, the present invention provides a technique of Vdd/Vss to Vinh/Vss conversion at each selected LBL for several preferred m-page operations such as Program, Program-Verify, Erase, and Read, storing Vdd/Vss at each P/RB in accordance with each corresponding SLC or MLC bit data through each corresponding metal2 GBL. The Vinh is concurrently supplied by a metal0 power line LBLps Driver per each selected Segment.

In yet another alternative embodiment, the present invention provides a controllable DRAM-like charge-sharing on a page-by-page basis between one row of N small metal1 parasitic C_(LBL) capacitors and up to J corresponding N large metal2 parasitic C_(GBL) capacitors for each SLC or MLC random page Read or Program-Verify.

Before performing charge-sharing, the initial LBL voltages of a selected page of cells are Vinh/Vss after voltage conversion, and the initial GBL voltage is reset to Vss in all connected Groups. During the GBL and LBL charge-sharing step, the corresponding MGBL transistors have to be turned on to allow each sensed analog signal of Vinh in a C_(LBL) capacitor to be shared and diluted by up to J C_(GBL) capacitors, when Group J's Segment is selected along the GBL, so that each corresponding Multiplier can perform the first analog amplification. The diluted analog signal is then amplified by the corresponding Multiplier to a final full digital Vdd/Vss signal and is transferred to and latched in each corresponding bit of the P/RB.
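The dilution described here follows from ordinary charge conservation between a precharged capacitor and discharged ones. The following is a minimal sketch under simplifying assumptions: every GBL section has the same capacitance, all connected C_(GBL) capacitors start at Vss=0V, and leakage and transistor drops are ignored. The function name and the capacitance ratio used in the example are hypothetical.

```python
def shared_voltage(v_lbl, c_lbl, c_gbl, groups_connected):
    """Voltage after one C_LBL shares charge with `groups_connected` C_GBL sections.

    Assumes the GBL sections were reset to 0 V and have equal capacitance c_gbl,
    so the final voltage is Q / C_total = v_lbl * c_lbl / (c_lbl + j * c_gbl).
    """
    if groups_connected < 1:
        raise ValueError("at least one GBL section must be connected")
    total_capacitance = c_lbl + groups_connected * c_gbl
    return v_lbl * c_lbl / total_capacitance

# Illustrative numbers only: Vinh = 7 V, C_GBL assumed 9x larger than C_LBL.
for j in (1, 2, 4):
    print(j, round(shared_voltage(7.0, 1.0, 9.0, j), 3))
```

The sketch shows why a Segment located in a farther Group, which connects more GBL sections, presents a smaller analog signal to the Multiplier, and why an LBL left at Vss stays at Vss after sharing.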

In still another alternative embodiment, the present invention provides a Segment decoder circuit with 3 LV (Low-Voltage) inputs of Ri, Tj, and Gk supplied by three pre-decoders, R-dec, T-dec, and G-dec, shared by two separate outputs, SEGo for Odd selection of a Segment and SEGe for Even selection of the Segment, incorporating one local VHV pump circuit and one pair of MHV inputs, SEGpe and SEGpo. The two outputs SEGe and SEGo are used to control the respective Even and Odd NMOS MHV (Medium-High-Voltage) Segment-select transistors, MLBLpe and MLBLpo, as shown in FIG. 1A. When an Even MLBLpe is selected during the charge-sharing cycle between each Even metal1 C_(LBL) capacitor and each common metal2 C_(GBL) capacitor, SEGpe≧Vinh+Vt+1V (1V is a margin voltage) to allow Vinh (pass Program-Verify or Read) to be fully passed to each corresponding 4 KB metal2 GBL. At the same time, SEGpo and SEGo are set to Vss, thereby disconnecting the 4 KB Odd metal1 LBLs of the Segment from the 4 KB metal2 GBLs so that bus contention is avoided. The connection and disconnection of each of SEGe and SEGo can be independently set with the following preferred conditions:

a) SEGe≧Vinh+Vt with SEGo=Vss when only the 4 KB Even CACHE C_(LBL) capacitors of one Segment are selected, for 4 KB Even LBLs charge-sharing with the common 4 KB GBLs,
b) SEGo≧Vinh+Vt with SEGe=Vss when only the 4 KB Odd CACHE C_(LBL) capacitors of one Segment are selected, for 4 KB Odd LBLs charge-sharing with the common 4 KB GBLs,
c) SEGe=SEGo=Vss when neither the 4 KB Even nor the 4 KB Odd CACHE C_(LBL) capacitors of one Segment are selected for any charge-sharing operation.
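Conditions a) through c) can be summarized in a small lookup, sketched below. Vss is taken as 0V, the returned values are the minimum gate levels implied by the inequalities, and the function name and the string labels for the selection are hypothetical.

```python
def segment_select_levels(selection, v_inh, v_t):
    """Return the minimum (SEGe, SEGo) levels in volts for one Segment.

    selection: "even", "odd", or "none" (hypothetical labels).
    A selected half needs at least Vinh + Vt on its gate so that Vinh passes
    fully; the other half is held at Vss (0 V here) to avoid GBL contention.
    """
    pass_level = v_inh + v_t
    if selection == "even":
        return (pass_level, 0.0)
    if selection == "odd":
        return (0.0, pass_level)
    if selection == "none":
        return (0.0, 0.0)
    raise ValueError(f"unknown selection: {selection!r}")

# Illustrative values only: Vinh = 7 V, Vt = 1 V.
print(segment_select_levels("even", 7.0, 1.0))
print(segment_select_levels("odd", 7.0, 1.0))
print(segment_select_levels("none", 7.0, 1.0))
```

Because Even and Odd LBLs share the same GBLs, the sketch never returns a state in which both halves are connected at once.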

In still another alternative embodiment, the present invention discloses N paired-Segment selections when a full or partial 4 KB page data is selected for loading from 4 KB DBs or 4 KB SAs, where N is an integer and N≧2. This condition is frequently used in the present invention along with a TIE transistor that connects one pair of CACHEs such as CACHEcel and CACHEint.

In yet still another alternative embodiment, the present invention discloses a second preferred Segment decoder circuit with 3 similar LV (Low-Voltage) inputs of Ri, Tj, and Gk supplied by 3 pre-decoders, R-dec, T-dec, and G-dec, shared by four separate inputs, SEGpa, SEGpb, SEGpc, and SEGpd, and four separate outputs, SEGa, SEGb, SEGc, and SEGd, but with one shared VHV pump circuit. The four outputs SEGa, SEGb, SEGc, and SEGd are used to control four respective Segment-select transistors, MLBLpa, MLBLpb, MLBLpc, and MLBLpd, as shown in FIG. 1B. When one of the four Segment-select transistors MLBLpa-MLBLpd is selected for charge-sharing, the remaining three Segment-select transistors have to be shut off to disconnect the corresponding C_(LBL) capacitors from the 4 KB metal2 GBLs without disturbing the on-going charge-sharing on the one sector of the Segment selected by the one of the four outputs SEGa, SEGb, SEGc, and SEGd.

In yet still another alternative embodiment, the present invention provides a preferred Segment-decoder function to instantly set an HXS node to Vss by setting a one-shot pulse of ESB=Vdd for a preset duration when an unintentional Vdd power loss is detected. In this manner, the SEGo and SEGe voltages of the selected Segment can be set to Vss so that all LBL voltages of the on-going Segments' CACHE C_(LBL) capacitors are immediately saved after an unexpected power-down and can be reused to continue the operations after Vdd is powered back within a certain idle time.

In an alternative embodiment, the present invention discloses a preferred Block decoder circuit, as shown in FIG. 2A, including a latch circuit coupled with one pre-decoder with three inputs of Pi, Qj, and Sk, and an HV pump circuit with 64 separate inputs of XT1 to XT64, GSLp, and SSLp, a VHH input, and one set of corresponding outputs of WL1 to WL64, SSL, and GSL lines to control the word line gates and String-select gates of a selected NAND Block. For m-page Program and Program-Verify operations, the latch is used to set an HXD node voltage to control both charging and latching of the corresponding voltages of each selected set of 64 WLs, an SSL line, and a GSL line under different m-page concurrent operations as summarized below:

a) HXD≧Vpgm+Vt during NAND SLC or MLC Program operation,
b) HXD≧Vread+Vt during NAND SLC or MLC Read operation,
c) HXD≧Vread+Vt during NAND SLC or MLC Program-Verify operation,
d) HXD≧Vread+Vt during NAND SLC or MLC N-page random Erase-Verify operation,
e) HXD=Vss during 64WL+1SSL+1GSL voltage latching operation.
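The HXD conditions a) through e) can be expressed as a lookup by operation, as in the sketch below. The operation labels, the function name, and the example Vpgm value are assumptions for illustration; Vss is taken as 0V.

```python
def hxd_min_level(operation, v_pgm, v_read, v_t):
    """Return the minimum HXD node level in volts for a Block-decoder operation.

    Program needs Vpgm + Vt so the highest WL voltage passes fully; Read,
    Program-Verify, and Erase-Verify need Vread + Vt; latching the 64 WL,
    SSL, and GSL voltages sets HXD to Vss (0 V here).
    """
    if operation == "program":
        return v_pgm + v_t
    if operation in ("read", "program_verify", "erase_verify"):
        return v_read + v_t
    if operation == "latch":
        return 0.0
    raise ValueError(f"unknown operation: {operation!r}")

# Illustrative values only: Vpgm = 20 V (assumed), Vread = 6 V, Vt = 1 V.
for op in ("program", "read", "program_verify", "erase_verify", "latch"):
    print(op, hxd_min_level(op, 20.0, 6.0, 1.0))
```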

In another alternative embodiment, the present invention discloses a preferred Block-decoder function configured to immediately set an HXD node to Vss by setting a one-shot pulse of ENB=Vdd for a preset duration when an unintentional Vdd power loss is detected. In this manner, the one set of 64WLs+1SSL+1GSL voltages of the selected Block can be locked to continue the last operation. This can be done by quickly setting ENB to Vdd and CLWL to Vss in accordance with the circuit of FIG. 2A.

In yet another alternative embodiment, the present invention discloses preferred cell Vt assignments for the 2 Vts of each SLC bit, the 2 Vts of each MLC MSB bit, and the 4 Vts of each MLC cell storing both MSB and LSB bits. The common B′-state, defined as a transient state in both the 2-Vt SLC and the 2-Vt MSB bit, preferably has a common Vtb′min≧Vtamax for a larger ΔVt margin between each E-state erased cell and each B′-state programmed cell (as shown in FIG. 4).

In still another alternative embodiment, the present invention discloses preferred Command sets and Timing waveforms for m-page concurrent operations. The concept is to first add m consecutive page Addresses, followed by m SLC page data or 2m MLC page data, between the Start code and the End code, where m≧1. The m-page Addresses and the m-page SLC or 2m-page MLC data can also be separated into two separate commands, each with its own Start code and End code.

In yet still another alternative embodiment, the present invention discloses a preferred 1-bit data buffer (DB) circuit that is comprised of a 1-bit Multiplier for a first amplification of the small analog cell signal, a 1-bit Sense Amplifier (SA) for a second analog amplification to a full digital signal, a 1-bit P/RB (Program/Read Buffer) for temporarily storing bit data, and a 1-bit Program-check circuit.

In a specific embodiment, the present invention provides a method for performing both m-page SLC and MLC (MSB page) pipeline data loading and m-page concurrent and pipeline Program as shown in FIG. 6A in accordance with a HiNAND2 array (FIG. 1A) and its associated peripheral circuits of 4 KB DB and 4 KB real CACHE. Up to m random-page 8 KB (All-BL) Program operations can be performed with partially overlapping programming time intervals in a concurrent/pipeline manner. For SLC and MLC MSB Program per cell, the number of cell states increases from one initial erase state to two states: an erase state (E-state) and a program state defined as B′-state, with a predetermined minimum Vt value no smaller than the maximum Vt value of an A-state, which is the lowest program state for a programmed MLC cell. Note, in a scheme of non-random m-page 8 KB Program operation, m non-random pages can be programmed concurrently in a fully overlapping time interval with an m-fold reduction in program time. Non-random m pages means that all m selected pages have the same physical WL address, i.e., the same location in each 64-cell String. For example, if WL31 is selected from one Block, then all the remaining m-1 pages are also selected from the same address of WL31 in the remaining m-1 dispersed NAND Blocks. In this preferred flow, two pseudo 8 KB CACHEcel and CACHEint registers are required for storing temporary 8 KB random or non-random page data.
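The distinction between random and non-random m-page selections can be checked with a short sketch. The (block, wl) tuple layout and the function name are hypothetical; the only rule taken from the text is that non-random pages share one WL address across the selected Blocks.

```python
def is_non_random(selected_pages):
    """True if all selected (block, wl) pages share the same WL address.

    Per the definition above, such a selection can be programmed in a fully
    overlapping interval; any other selection is treated as random-page and
    programmed with partially overlapping intervals.
    """
    if not selected_pages:
        raise ValueError("at least one page must be selected")
    return len({wl for _block, wl in selected_pages}) == 1

# WL31 in three dispersed Blocks: non-random.
print(is_non_random([(0, 31), (5, 31), (9, 31)]))
# Different WL addresses: random-page selection.
print(is_non_random([(0, 31), (5, 12), (9, 31)]))
```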

In another specific embodiment, the present invention provides a method for performing both 2N-bit SLC and 2N-bit MLC (N-bit MSB Even and N-bit MSB Odd page) concurrent m-page All-BL Program-Verify as shown in FIG. 6B and FIG. 6C in accordance with a HiNAND2 array (FIG. 1A) and its associated peripheral circuits of 4 KB DB and 4 KB real CACHE as well as two pseudo CACHEs. The All-BL Program-Verify operation can be performed on up to m pages simultaneously in one cycle during the 2N-bit LBL-precharge step and the 2N-bit LBL-discharge-and-Vinh-retaining step on a page-by-page basis, and is performed sequentially in two cycles during the N-bit GBL-LBL charge-sharing step on a half-page-by-half-page basis, because Program-Verify has to be done in the commonly shared N-bit SA and P/RB with a 50% area reduction in the data buffer.

In yet another specific embodiment, the present invention provides a method for performing MLC (LSB) m-page data loading and m-page B′-bit adjustment as shown in FIG. 7A in accordance with a HiNAND2 array (FIG. 1A), its associated peripheral circuits of 4 KB DB and 4 KB real CACHE, and the preferred 4-Vt MLC MSB and LSB assignments. B′-bit data adjustment of up to m 8 KB pages on a page-by-page basis, before an m-page All-BL LSB Program operation, can be performed in a concurrent/pipeline manner, regardless of m random-page or m non-random-page schemes. Up to four pseudo CACHEs, such as the 8 KB CACHEcel, 8 KB CACHEint, 8 KB CACHEmsb, and 8 KB CACHElsb registers per LSB page, are required for locally storing four temporary 8 KB random or non-random page data for the B′-state bit-flipping of each selected page or WL without using the data buffer. The B′-state bit-flipping step is required for correctly generating the final two higher MLC program states, a B-state and a C-state. During the m-page All-BL MLC (LSB) Program operation, some cells of E-state are selectively programmed into A-state and some cells of B′-state are selectively programmed into B-state and C-state in accordance with the MLC LSB page data. The m pages are from a total of m Segments distributed in one or more of the J Groups.

In still another specific embodiment, the present invention provides a method for performing an m-page A-state Program-Verify operation as shown in FIG. 7B. Up to m pages of A-state Program-Verify operations can be performed with partially overlapping time intervals on a page-by-page basis in a concurrent/pipeline manner, regardless of m random pages or m non-random pages, due to the same reason of the limited Data Buffer capacity adopted for area saving. Note, only three pseudo 8 KB CACHEs are required, namely CACHEcel, CACHEint, and CACHEmsb, for A-state bit flipping. CACHEcel is utilized for temporarily storing a full page of currently updated verify data, CACHEint for temporarily storing the last updated data, and CACHEmsb for storing the retrieved MLC MSB page data. The A-state Program-Verify operation using Vtamin as a verify voltage includes transferring the last updated data from CACHEint to two first storage nodes of the P/RB per bit; transferring the MLC MSB page data from CACHEmsb to the SA per bit through the Multiplier for amplification, then writing it back to CACHEmsb in the same data polarity and transferring it into one second storage node of the P/RB per bit; and verifying the currently updated data from CACHEcel in the SA per bit against the data in both the first storage nodes and the second storage node of the P/RB per bit based on the Vt distribution of A-state.

In yet still another specific embodiment, the present invention discloses a method for performing m-page B-state Program-Verify as shown in FIG. 7C. Up to m pages of B-state Program-Verify operations can be performed simultaneously with partially overlapping time intervals on a page-by-page basis in a concurrent/pipeline manner, regardless of m random pages or m non-random pages, due to the same reason of the limited Data Buffer capacity adopted for area saving. Note, only three pseudo 8 KB CACHEs are required, namely CACHEcel, CACHEint, and CACHElsb, for B-state bit flipping. The B-state Program-Verify operation using Vtbmin as a verify voltage includes transferring the MLC LSB page data from CACHElsb to the SA per bit through the Multiplier for amplification, then writing it back to CACHElsb in the same data polarity and transferring it into one second storage node of the P/RB per bit; and verifying the currently updated data from CACHEcel in the SA per bit against the data in the first storage nodes, transferred from CACHEint during the A-state verify, and the data in the second storage node of the P/RB per bit based on the Vt distribution of B-state.

In yet still another specific embodiment, the present invention discloses a method for performing m-page C-state Program-Verify as shown in FIG. 7D. Up to m pages of C-state Program-Verify operations can be performed simultaneously with partially overlapping time intervals on a page-by-page basis in a concurrent/pipeline manner, regardless of m random pages or m non-random pages, due to the same reason of the limited Data Buffer capacity adopted for area saving. Note, only three pseudo 8 KB CACHEs are required, namely CACHEcel, CACHEint, and CACHElsb, for C-state bit flipping. The C-state Program-Verify operation using Vtcmin as a verify voltage includes verifying the currently updated data from CACHEcel in the SA per bit against only the data in the first storage nodes, transferred from CACHEint during the A-state verify, based on the Vt distribution of C-state; updating the data in CACHEcel and CACHEint; and continuing with the next iterative All-BL MLC LSB Program based on the updated data in CACHEcel, stopping when the Program-Verify is passed.

In yet still another specific embodiment, the present invention discloses a method for performing m-page MLC (LSB page) data loading and concurrent B′-bit adjustment before m-page All-BL Program, as shown for the Even LSB page in FIG. 7E and the Odd LSB page in FIG. 7F, in accordance with a HiNAND2 array (FIG. 1A) and its associated peripheral circuits of 4 KB DB and 4 KB real CACHE, regardless of m random or non-random pages. A total of four CACHEs are required, namely CACHEcel, CACHEint, CACHElsb, and CACHEmsb, for B′-bit flipping. A preferred m-page B′-bit Adjustment Flow is provided for a MLC physical cell that stores four final logic states, the E, A, B, and C states, but with five preferred initial MLC states, the E, A, B1′, B2′, and C states, where each final B-state is split into two initial temporary states B1′ and B2′ by setting the corresponding Vt values to Vtb1′min≦Vtb1′≦Vtb1′max, where Vtb1′max≦Vtb2′min=Vtbmin.

The goal of the MLC B′-bit adjustment is to turn the LSB bit data, whose logic pattern is 10110 for the five initial states E, A, B1′, B2′, and C, into the desired final logic pattern of 10010 for the subsequent m-page LSB Program operation, as shown in FIG. 7G. In this manner, a MLC cell with a tighter 4-Vt distribution can be achieved.
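Comparing the two logic patterns bit by bit shows which initial state is affected by the adjustment. The sketch below only restates the patterns 10110 and 10010 given in the text; the variable and function names are hypothetical, and the actual adjustment in the disclosure is carried out through the pseudo CACHEs rather than in software.

```python
INITIAL_STATES = ("E", "A", "B1'", "B2'", "C")
INITIAL_LSB_PATTERN = "10110"  # LSB bit per initial state, as given in the text
FINAL_LSB_PATTERN = "10010"    # desired LSB bit per state after adjustment

def flipped_states(initial=INITIAL_LSB_PATTERN, final=FINAL_LSB_PATTERN):
    """Return the initial states whose LSB bit changes during adjustment."""
    return [
        state
        for state, before, after in zip(INITIAL_STATES, initial, final)
        if before != after
    ]

def adjusted_bit(state):
    """Look up the adjusted LSB bit (0 or 1) for one initial state."""
    return int(FINAL_LSB_PATTERN[INITIAL_STATES.index(state)])

print(flipped_states())
```

Under these patterns only the B1′ entry changes, from 1 to 0, while the E, A, B2′, and C entries are left as they are.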

In yet still another specific embodiment, the present invention discloses a method for performing m-page SLC or MLC Program-Verify. Up to m pages of Even and Odd Program-Verify operations can be performed simultaneously with partially overlapping time intervals on a page-by-page basis in a concurrent/pipeline manner, regardless of m random pages or m non-random pages, but optionally the sequence of Program-Verify of the Even and Odd pages can be alternately reversed once per iterative verify step.

In an alternative embodiment, the present invention discloses a method for performing an m-page All-BL SLC Read operation. Up to m pages of SLC Even and Odd All-BL Read operations can be performed simultaneously with partially overlapping time intervals on a page-by-page basis in a concurrent/pipeline manner, regardless of m random pages or m non-random pages. Only one CACHEcel is required per page because no bit flipping is needed.

In another alternative embodiment, the present invention discloses a method for performing an m-page All-BL MLC (MSB page) Read operation. Up to m pages of Even and Odd MLC (MSB) Read operations, with the condition that each page Flag cell is assigned to 1, can be performed simultaneously with partially overlapping time intervals on a page-by-page basis in a concurrent/pipeline manner, regardless of m random pages or m non-random pages. The features of the m-page MLC (MSB) Read operation include assigning a Flag cell to 1 to indicate that the addressed MLC-WL only stores the 2-Vt MLC MSB page data and that the LSB bit data of each MLC cell is not stored yet. In this case, the MLC LSB bit data is set to “1”, i.e., LSB=1. Each flow of the m-page MLC MSB Read operation then has fewer steps. For this preferred MLC (MSB) Read operation, only the CACHEcel is required for storing temporary read data for distinguishing an initial program B′-state from the erase E-state per cell.

In yet another alternative embodiment, the present invention discloses a method for performing an m-page MLC (LSB page) Read operation. Up to m pages of Even and Odd MLC (LSB) Read operations, with the condition that each page Flag cell is assigned to 0, can be performed simultaneously with partially overlapping time intervals on a page-by-page basis in a concurrent/pipeline manner, regardless of m random pages or m non-random pages. The m-page MLC (LSB) Read is performed when the Flag cell is assigned to 0 to indicate that the addressed MLC cell stores the 4-Vt of both MSB and LSB bits. A total of three pseudo CACHEs, CACHEcel, CACHEint, and CACHEmsb, are required for the MLC LSB Read operation. One pair of CACHEcel and CACHEint is tied to simultaneously store temporary data for distinguishing the E-state from the A, B, and C states via a VR1 (first read voltage) read. Then, the TIE signal is turned off to isolate CACHEint from CACHEcel. Next, CACHEcel is utilized for storing temporary data for distinguishing the E and A states from the B and C states via a VR2 (second read voltage) read, which is equivalent to the MSB data. This MSB data is then transferred to the SA and to the I/O via the CACHE register and is written back to CACHEmsb. Next, CACHEcel is utilized for storing temporary data for distinguishing the E, A, and B states from the C state via a VR3 (third read voltage) read. The data in CACHEint is restored to the storage nodes of the P/RB per bit, and the data in CACHEmsb is restored to the SA per bit and further transferred to CAP1 and CAP2 of the P/RB per bit. The data in CACHEcel is restored to the SA per bit. The B-state cell data is flipped in polarity in the P/RB per bit. Lastly, the MLC LSB data is read from the P/RB.

Further, the present invention discloses a method for performing an SLC Block Read operation with only one CACHEcel.

Additionally, the present invention discloses a method for performing alternative m-page MLC Read operations with a total of three CACHEs, namely CACHEcel, CACHEint, and CACHEmsb.

In an embodiment, the present invention discloses an m-page MLC Read Flow for differentiating the four final logic states, the E, A, B, and C states, in a MLC physical cell. Each MLC MSB-bit logic read data is 0011 for the four respective E, A, B, and C states, while each MLC LSB-bit logic read data pattern is 0101 for the four respective E, A, B, and C states, by flipping B-state=1 to B-state=0 to program C-state.
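The read patterns 0011 and 0101 can be reproduced with a small behavioural sketch. It assumes three read voltages VR1<VR2<VR3 and that a read returns 1 when the cell Vt is above the read voltage; this polarity, the example Vt and VR values, and the function name are illustrative assumptions, and the XOR logic is a compact stand-in for the CACHE and P/RB transfers of the actual flow.

```python
def read_mlc_bits(vt, vr1, vr2, vr3):
    """Return (msb, lsb) for one cell from three threshold comparisons.

    r1 separates E from A/B/C, r2 separates E/A from B/C (the MSB), and
    r3 separates E/A/B from C. The LSB starts from r1 (0111 for E, A, B, C)
    and the B-state entry is flipped from 1 to 0, giving 0101.
    """
    r1 = int(vt > vr1)
    r2 = int(vt > vr2)
    r3 = int(vt > vr3)
    msb = r2
    is_b_state = r2 ^ r3      # 1 only when VR2 < Vt <= VR3
    lsb = r1 ^ is_b_state     # flip the B-state bit of the VR1 read
    return msb, lsb

# Illustrative Vt values per state and read voltages (assumed, in volts).
example_vt = {"E": -1.0, "A": 1.0, "B": 2.5, "C": 4.0}
bits = [read_mlc_bits(vt, 0.0, 2.0, 3.5) for vt in example_vt.values()]
msb_pattern = "".join(str(m) for m, _ in bits)
lsb_pattern = "".join(str(l) for _, l in bits)
print(msb_pattern, lsb_pattern)
```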

In another embodiment, the present invention discloses a sequential load/latch (LD/LT) technique on a ½-page by ½-page basis, along with a plurality of latches incorporated in the Even and Odd Segment-decoders and Block-decoders, to store and lock m random-page 8 KB SLC, MLC MSB, and MLC LSB page data into m designated random 8 KB pseudo CACHE capacitors concurrently. As a result, only one 4 KB (½-page) data buffer (DB), rather than m×8 KB DBs, is required to perform these preferred m random-page Block operations, with a big saving in the peripheral area. The 8 KB page data stored in the pseudo CACHE include:

I. Externally-loaded page data: 8 KB MSB Vinh/Vss conversion page data or 8 KB LSB Vinh/Vss conversion page data;
II. Internally-generated iterative Vinh/Vss conversion page data: B′-adjustment 8 KB page data before 8 KB Program;
III. Internally-generated iterative Vinh/Vss conversion page data: 8 KB MSB page data by VR1 reading from the 8 KB CACHEcel that stores 8 KB MSB conversion page data;
IV. Several internally-generated temporary iterative 8 KB Vinh/Vss conversion page data: A-state, B′-state, B-state, and C-state iterative Program-Verify 8 KB page data;
V. Internally-generated 8 KB Vinh precharged data;
VI. Internally-generated 8 KB Vinh/Vss conversion data by Read operation.

In yet another embodiment, the present invention discloses a method for performing M-page concurrent operations such as 1) m randomly selected Full-page Block SLC Program; 2) m randomly selected Full-page Block MLC MSB Program; 3) m randomly selected Full-page Block MLC LSB Program; 4) m random-page Block Full-page SLC Program-Verify; 5) m random-page Block Full-page MLC MSB Program-Verify; 6) m random-page Block Full-page MLC LSB Program-Verify; 7) m random-page Block Full-page MLC MSB Read; 8) m random-page Block Full-page MLC LSB Read; 9) m random-page Block Full-page SLC Read; 10) m random-page Block Full-page CACHE Vdd/Vss to Vinh/Vss conversion in accordance with each Vdd/Vss page data from the 4 KB P/RB or 4 KB SA; 11) m random-page concurrent Full-page CACHE Vinh discharging or retaining in accordance with SLC, MLC MSB, and MLC LSB Read and Verify page data; 12) m randomly selected Full-page CACHE C_(LBL) capacitors concurrent Vinh precharging by m randomly selected metal0 Segment power lines of LBLps in accordance with m selected random Segment addresses; and 13) m randomly selected WLs, unselected WLs, and selected SSL and GSL lines precharged to Vpgm, VR, Vpass, V_(READ), Vdd, and Vss concurrently.

In still another embodiment, the present invention discloses a method for performing one Full-page-based 8 KB concurrent precharging or discharging operation between one or more paired CACHE pseudo registers or C_(LBL) capacitors that are connected together by a transistor MLBLb with a gate signal TIE. This provides more flexibility in fewer steps when more than one CACHE is selected for performing the same concurrent operations in one cycle. Particularly, in the same Read or Verify operation, when one CACHE C_(LBL) is discharged by the selected cells, the corresponding C_(LBL) capacitors of the other paired CACHE will be discharged as well without using the commonly shared 4 KB GBLs. When the TIE signal is set to Vinh+Vt, the precharged voltage is Vinh for the 8 KB CACHEcel and the 8 KB CACHEint; when TIE≧Vdd, the paired 8 KB CACHEcel and CACHEint are discharged to Vss; when TIE is Vss, independent Vinh/Vss voltages are on both the 8 KB CACHEcel and the 8 KB CACHEint.

In yet still another embodiment, the present invention discloses a preferred flow path of Segment or CACHE C_(LBL) for precharging the inhibit voltage Vinh from the selected corresponding LBLps line supplied by a Vinh Driver (not shown). The Vinh voltage of the LBLps line is set to be Vdd≦Vinh≦10V. Each LBLps line is connected to one 20V NMOS buffer device for protecting the rest of the Vinh Driver circuit during Erase operation; otherwise, an erase voltage of 20V would couple from the Triple P-well of the NAND array.

In yet still another embodiment, the present invention discloses that anumber (n1) of each DB is preferred to be smaller than a number (n2) ofphysical NAND cells residing in one physical WL. In other words, it ispreferred to have n1<n2. More particularly, n1=a×n2, where a=1/(2^(t))and t is any integer number equal to or greater than 1. Further, it ispreferred that a number (n4) of tight lower-level 2λ-pitch metal1 LBLsis more than a number (n3) of loose higher-level metal2 GBLs. In otherwords, it is preferred to have n3<n4. More particularly, n3=a×n4.
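The sizing rule n1=a×n2 with a=1/(2^(t)) can be illustrated with a minimal sketch. The function name `buffer_bits` and the sample cell count of 8192 are hypothetical and chosen only for illustration; the same relation applies to n3=a×n4.

```python
def buffer_bits(n2: int, t: int) -> int:
    """Return n1 = n2 / 2**t, the DB width for a physical WL of n2 cells.

    t must be an integer >= 1, so n1 is always smaller than n2.
    """
    if t < 1:
        raise ValueError("t must be an integer equal to or greater than 1")
    if n2 % (2 ** t) != 0:
        raise ValueError("n2 must be divisible by 2**t")
    return n2 // (2 ** t)


# Hypothetical example: a WL of 8192 cells served by a half-width or quarter-width DB.
half_width = buffer_bits(8192, 1)
quarter_width = buffer_bits(8192, 2)
```

With t=1 the DB is half the physical page width, and with t=2 it is one quarter of the page width.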

In an alternative embodiment, the present invention provides a flexible inhibit voltage Vinh setting during voltage conversion of Vdd/Vss to Vinh/Vss in C_(LBL) capacitors depending on locations of those HiNAND2 Groups relative to DB. For regularly selected Groups, the voltage of Vinh can be flexibly set as Vdd≦Vinh≦10V depending on the degree of charge-sharing dilution. The upper limit of Vinh=10V is determined by the NMOS BVDS of MLBLps, MLBLp, MLBLs, MLBLb, MGBL, MG, and MS transistors. For those selected Groups near DB, there is less dilution due to charge-sharing between each C_(LBL) and each C_(GBL) line, thus a lower Vinh of Vdd can be used for less precharging current consumption.

In yet another alternative embodiment, a preferred m-page concurrent NAND operation can also be performed in a 1-level BL structure with a plurality of metal1 broken-GBLs. This 1-level BL structure is termed a HiNAND1 array, which can be applied in both 2D and 3D NAND using 1-level metal tight 2λ-pitch 4 KB GBLs. Because of the 1-level-only BL structure, more GBL-divided MGBL transistors per one GBL are required. As a result, more complex and slower metal1 broken-GBL traffic control is required. The HiNAND1 array has less flexible m-page concurrent operations than the HiNAND2 array but only needs one level of metal GBL in the NAND array for cost reduction.

In a specific embodiment, the present invention discloses a self-timed control circuit and scheme for precharging m sets of 64WLs+1SSL+1GSL lines by a Vpgm WL Detector made of a 2-input Differential Amplifier (DA1) with one input connected to a dummy Vpgm WL and the other input coupled to a Vpgm reference voltage. This dummy Vpgm WL voltage is initially reset to Vss before m selected sets of 64WLs+1SSL+1GSL lines of m Blocks are selected for concurrent SLC or MLC Program. Once the concurrent m-page Program starts, the m randomly selected sets of 64WLs+1SSL+1GSL lines are precharged to the corresponding desired voltages such as Vpgm, Vpass, and Vdd. Since Vpgm is the highest and slowest HV voltage for the m randomly selected WLs, the dummy WL mimics the m randomly selected WLs by being charged with Vpgm only during both SLC and MLC Program operations. Once Vpgm is detected by DA1, the precharging of the m selected sets of 64WLs+1SSL+1GSL lines is automatically stopped. This is referred to as Vpgm-DA.
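The self-timed stop described above can be sketched behaviorally as follows. This is a minimal model, not the disclosed circuit: it assumes the dummy WL charges as a first-order RC node, and because an ideal RC node never quite reaches its drive level, the comparator reference is placed slightly below the drive level. The function name, the time constant, and the voltage values are all hypothetical.

```python
import math


def self_timed_precharge(v_drive, v_ref, tau, dt=1e-8, max_steps=1_000_000):
    """Charge a dummy WL from Vss (0 V) and stop when the comparator trips.

    v_drive: level the driver charges toward (assumed, in volts)
    v_ref:   detector reference; must be below v_drive for the trip to occur
    tau:     assumed RC time constant of the dummy WL, in seconds
    Returns (stop_time_seconds, dummy_wl_voltage_at_stop).
    """
    if not 0.0 < v_ref < v_drive:
        raise ValueError("v_ref must lie between Vss (0 V) and the drive level")
    for k in range(max_steps):
        t = k * dt
        # First-order RC charging curve of the dummy WL (modeling assumption).
        v_wl = v_drive * (1.0 - math.exp(-t / tau))
        if v_wl >= v_ref:
            # Comparator output flips here: precharging of all m sets stops.
            return t, v_wl
    raise TimeoutError("reference level was not reached within max_steps")


# Hypothetical numbers: 20 V drive, 19 V reference, 1 us dummy-WL time constant.
stop_time, wl_voltage = self_timed_precharge(20.0, 19.0, 1e-6)
```

The point of the sketch is that the stop time follows the dummy WL's own RC delay rather than a fixed timer, which is the tracking behavior attributed to the detector.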

In another specific embodiment, the present invention discloses a self-timed control circuit and scheme for precharging m sets of 64WLs+1SSL+1GSL lines by a Vread WL Detector made of a 2-input Differential Amplifier (DA2) with one input connected to a dummy Vread WL and the other input coupled to a Vread reference voltage. This is referred to as Vread-DA. This dummy Vread WL voltage is initially reset to Vss before m selected sets of 64WLs+1SSL+1GSL lines of m Blocks are selected for concurrent m-page SLC or MLC Read or Program-Verify or Erase-Verify. Once the concurrent m-page Read or Program-Verify or Erase-Verify starts, the m randomly selected sets of 64WLs+1SSL+1GSL lines are precharged to the corresponding desired voltages such as Vread and Vdd. Since Vread is the highest and slowest HV voltage for the m randomly selected WLs, the dummy Vread WL mimics the m randomly selected WLs by being charged with Vread only during both m-page SLC and MLC Program-Verify or Erase-Verify or Read operations. Once Vread is detected by DA2, the precharging of the m selected sets of 64WLs+1SSL+1GSL lines is automatically stopped.

In yet another specific embodiment, the present invention discloses a self-timed control circuit and scheme for precharging m sets of CACHE C_(LBL) to Vinh voltage using a Vinh Precharging Detector made of a similar 2-input Differential Amplifier (DA3) with one input connected to a dummy Vread WL and the other input coupled to Vread reference voltage to detect the corresponding LBLps Vinh power line. During CACHE C_(LBL) precharging from Vss to Vinh, the selected LBLps line would be charged by a Vinh Driver at one end and detected by the DA3 connected at another end. Once the voltage of each LBLps line reaches Vinh, each corresponding DA3 will detect it and issue a signal to stop the Vinh precharging operation automatically. This is referred to as VLBL-DA or LBLps-DA.

In still another specific embodiment, the present invention discloses a self-timed control circuit and scheme for discharging m sets of CACHE C_(LBL) lines by the same Vinh Precharging Detector DA3 but with its reference connected to a discharge voltage set at 2.0V or lower. Once 2.0V is detected, the MLBLse and MLBLso transistors are immediately shut off to prevent leakage between adjacent C_(LBL) capacitors when gate voltages PREo and PREe are set to 2.0V+Vt initially. Note, during C_(LBL) capacitor discharging detection, the initial voltage of the LBLps line is the Vinh voltage left from the previous precharge operation, so no power is consumed again.

In yet still another specific embodiment, the present invention discloses a self-timed control circuit for setting one iterative program time, Tpgm. One typical program time is about 250 μs for SLC, which programs only one A-state. Each iterative program time, however, is about 25 μs if 10 successive iterative ISSP pulses are used for a tighter programmed Vt control. The Tpgm circuit can be triggered by the Vpgm Detector once the Vpgm voltage is reached in each ISSP program step. Note, since each iterative Vpgm voltage is gradually increased by ΔVpgm of 0.2V to 0.4V, the Vpgm Detector voltage has to be adjusted higher accordingly. Each iterative Tpgm time of 25 μs can be designed by using a simple RC-based delay with preferably a high 10 Meg ohm MOS-type resistor and a 2.5 pF MOS-type capacitor. This RC-based Tpgm adjustment can be finalized by trimming or from empirical experiment data. Once the self-timed Tpgm is reached, an immediate self-timed HV discharging of 64WLs+1SSL+1GSL lines will be automatically executed by the same Vpgm or Vread Detector, which uses a value≦1.0V to replace Vpgm or Vread as Vref at the dummy WL.
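The Tpgm arithmetic above can be checked with a short sketch. It takes the RC product as the delay, which is only a first-order approximation since the actual trip point depends on the detector threshold. The starting Vpgm of 15.0 V is a hypothetical value, as the text gives only the step size ΔVpgm.

```python
R_OHMS = 10e6        # 10 Meg ohm MOS-type resistor, from the text
C_FARADS = 2.5e-12   # 2.5 pF MOS-type capacitor, from the text


def tpgm_seconds(r_ohms=R_OHMS, c_farads=C_FARADS):
    """First-order estimate of one iterative program time as the RC product."""
    return r_ohms * c_farads


def pulse_schedule(v_start, delta_v, pulses):
    """Return the Vpgm level of each successive pulse (also the detector Vref).

    v_start is an assumed starting level; delta_v is the per-pulse increment.
    """
    return [round(v_start + k * delta_v, 6) for k in range(pulses)]


one_pulse = tpgm_seconds()                 # about 25 us
total_time = 10 * one_pulse                # about 250 us for 10 pulses
levels = pulse_schedule(15.0, 0.2, 10)     # hypothetical 15.0 V start, 0.2 V step
```

Ten pulses of about 25 μs each reproduce the quoted 250 μs total, and the detector reference has to follow the rising `levels` list from pulse to pulse.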

Further, the present invention discloses a preferred method to build a tracking dummy WL layout for the above Vpgm and Vread Detectors. Practically, in layout, at least three dummy WLs of the same length as each regular WL are laid out together as one set. The dummy WL used for the Vpgm or Vread Detector is the middle WL with two adjacent WLs surrounding it so that the parasitic WL-WL adjacent capacitance and one WL resistance are included to simulate the real worst-case WL delay. One end of the middle dummy WL is connected to a Vpgm generator and the other end of the middle WL is connected to one input of a Vpgm or a Vread Detector. Using only three dummy WLs saves circuit area because there are too many Blocks in the NAND array; it would take too much overhead if one dummy WL detector were built per Block.

Furthermore, an objective of the present invention is to add a pair of gated capacitors CAP1 and CAP2 as second storage nodes to a pair of first storage nodes Qi and QiB for each P/RB so that extra temporary storage bits are created in a small area to allow more flexible and effective MLC MSB and LSB page concurrent Program-Verify and bit-flipping logic operations.

Alternatively, the present invention discloses a method for transferring those failed m-page SLC or MLC Program pages of a first selection to other already-erased pages of a second selection for continuing the SLC and MLC Program and Program-Verify operations. As such, each preferred m-page Program operation with m loaded page data can be completed without failure, so there is no need to reload those failed page data into the HiNAND2 array again. This can be done by mapping the newly erased page addresses into the newly selected corresponding Segment and Block-decoder latches. The Host or Flash Controller should keep records of those erased Blocks so that programming of the failed pages can continue on them.

Further alternatively, the present invention also discloses a method for performing mixed m-page concurrent operations whenever no contention happens on those shared 4 KB metal2 C_(GBL) lines, metal1 8 KB C_(LBL) lines in pseudo 4 KB CACHE registers, one set of 64XTs+1SSLp+1GSLp bus lines and 4 KB real CACHE registers in accordance with the HiNAND2 array shown in FIG. 1A. The concurrent m-page operations include m-page concurrent SLC Program operations on m Segments, m-page concurrent mixed SLC-page/MLC Program operations on m Segments, m-page concurrent SLC Read operations on m Segments, m-page concurrent MLC Read operations on m Segments, m-page concurrent mixed SLC Program operations in some Segments while MLC Read operations on other Segments, m-page concurrent mixed MLC Program operations in some Segments while SLC Read operations on other Segments, m-page concurrent mixed SLC Program operations in some Segments while MLC Verify operations on other Segments, m-page concurrent mixed MLC Program operations in some Segments while MLC Verify operations on other Segments, m-page concurrent mixed SLC Program operations in some Segments while Erase-Verify operations on other Segments, m-page concurrent mixed MLC Program operations in some Segments while Erase-Verify operations on other Segments, m-page concurrent mixed SLC Program data loading operations in some Segments while Program-Verify operations on other Segments, m-page concurrent mixed SLC Read operations in some Segments while MLC Program operations in other Segments and MLC Program-Verify operations on some other Segments, m-page concurrent mixed MLC Program operations in some Segments while SLC Read operations on some Segments and MLC Read operations in other Segments, and m-page concurrent multiple mixed operations.

Still further, the present invention also discloses one common lowest-level horizontal metal0 power line of LBLps per Segment disposed perpendicular to a plurality of 1-level higher metal1 LBLs and/or a plurality of 2-level higher metal2 GBLs through a plurality of NMOS transistors MLBLs with their common gate tied to a common signal PRE for performing a plurality of concurrent m-page NAND operations. Each LBLps line is configured for supplying a precharging or discharging current to a plurality of LBLs by each corresponding LBLps driver instead of a plurality of DB bits and for providing a flexible Vinh voltage ranging from Vdd to 10V. The Vinh voltage is only limited by the device BVDS spec of the corresponding precharge transistor MLBLs, which is preferably made of the same device as the NAND String-select transistor MG or MS.

Furthermore, embodiments of the present invention are also applicable to TLC and XLC types of NAND array structures and even to analog NAND arrays as long as the pseudo CACHEs are used to store the temporary data without using the real CACHEs. In other words, the on-chip novel pseudo CACHEs can store m pages of data temporarily in both digital form and analog form with voltages ranging from Vss to Vinh, which is only limited by the minimum BVDS of all NMOS devices connected to each C_(LBL) node, the Vinh pass-transistors and the Vinh source.

4. BRIEF DESCRIPTION OF THE DRAWINGS

In the following description, 2N-bit refers to the total of 8 KB physical NAND cells residing in one physical WL or Page. In this application, 2N-bit means a full physical WL page of 8 KB cells. Thereby, N-bit means 4 KB, which is ½ of one full physical page or ½ WL size.

FIG. 1A is a circuit diagram showing a preferred 2D HiNAND2 with 2-level BL-hierarchical structure including 2N (8 KB) local bit lines (LBLs) pseudo CACHE capacitors connected to N-bit Data/Cache Register via N global bit lines (GBLs) according to an embodiment of the present invention.

FIG. 1B is a circuit diagram showing a preferred 2D HiNAND2 with 2-level BL-hierarchical cell array including 2N (8 KB) local bit lines (LBLs) pseudo CACHE capacitors connected to N/2-bit Data/Cache Register via N global bit lines (GBLs) according to an embodiment of the present invention.

FIG. 1C is a schematic diagram comparing L×K Consolidated Physical NAND Blocks within one conventional NAND plane with L Dispersed Logic NAND Blocks according to an embodiment of the present invention.

FIG. 2A is a preferred Block-decoder circuit comprising a latch circuitwith a status check circuit PAS, a Pre-decoder circuit with three inputsof Pi, Qj, and Sk, and a Local HV Pump circuit to enable HV/LVconnections between each Block's Pre-decoder inputs of XT1-XT64, GSLp,and SSLp and the corresponding WL1-WL64, GSL, and SSL outputs of mselected Blocks of a HiNAND2 array and associated circuits of DB(Data-Buffer), CACHE register, and I/O Control during preferredmulti-page mixed SLC/MLC Program, Program-Verify, Read, Erase-Verifyoperations according to embodiments of the present invention.

FIG. 2B is a preferred Segment-decoder circuit comprising a latch circuit with status check circuit SPAS, a Pre-decoder circuit with three inputs of Ri, Tj, and Gk, and one Local HV Pump circuit to enable connection of a HV or LV SEGp input to be coupled to a corresponding SEG output for properly operating a Segment of a HiNAND2 array and associated circuits of DB, CACHE register, and I/O Control during preferred multi-page mixed SLC/MLC Program, Program-Verify, Read, Erase-Verify operations according to embodiments of the present invention.

FIG. 3 is a block diagram of a N-bit Data Register circuit associatedwith the preferred HiNAND2 cell array in FIG. 1A along with one NMOSY-Pass circuit with YAi and YBj column decoder inputs and one Byte-wideI/O Control circuit to demonstrate both the desired SLC and MLCmulti-page operations of the present invention.

FIG. 4 is a diagram showing two Vt distributions of a 2-Vt SLC NAND cellor a 2-Vt MLC MSB cell and a 4-Vt MLC NAND cell containing both 2-Vt MSBbit and 2-Vt LSB bit used in the preferred hybrid HiNAND2 array of FIG.1A according to an embodiment of the present invention.

FIG. 5 is a diagram showing a preferred multi-page Read and ProgramCommand Timing Waveforms of a SLC, a MLC, or a mixed SLC/MLC hybridHiNAND2 array of the present invention.

FIGS. 6A-6C are flow charts showing multi-page SLC or MLC (MSB page)data loading, Program, and Program-Verify operations associated with aplurality of pseudo CACHEs in accordance with the HiNAND2 array shown inFIG. 1A, the Block-decoder shown in FIG. 2A, the Segment-decoder shownin FIG. 2B, the Data & CACHE Register shown in FIG. 3, and the MLCcell's 4-Vt assignment shown in FIG. 4 according to embodiments of thepresent invention.

FIGS. 7A-7D are schematic diagrams showing methodologies of multi-pageSLC, MLC MSB and MLC LSB in various Read, Program and Program-Verifyoperations and a plurality of pseudo CACHEs in accordance with theHiNAND2 array shown in FIG. 1A, the Block-decoder shown in FIG. 2A, theSegment-decoder shown in FIG. 2B, the DB shown in FIG. 3, and the MLCcell's 4-Vt assignment shown in FIG. 4 according to embodiments of thepresent invention.

FIGS. 7E-7G are flow charts and bias conditions for multi-page MLC (LSBPage) data loading and B′ Adjustment operations in accordance with theHiNAND2 array shown in FIG. 1A, the Block-decoder shown in FIG. 2A, theSegment-decoder shown in FIG. 2B, the Data & CACHE Register shown inFIG. 3, and the MLC cell's 4-Vt assignment shown in FIG. 4 according toembodiments of the present invention.

FIGS. 7H-7M are flow charts for multi-page MLC (LSB Page) A-stateProgram-Verify operation and B′ Adjustment, B-state Program-Verifyoperation, and C-state Program-Verify operation in accordance with theHiNAND2 array shown in FIG. 1A, the Block-decoder shown in FIG. 2A, theSegment-decoder shown in FIG. 2B, the Data & CACHE Register shown inFIG. 3, and the MLC cell's 4-Vt assignment shown in FIG. 4 according toembodiments of the present invention.

FIGS. 8A-8C are bias conditions for multi-page SLC/MLC (MSB Page) Readoperations in accordance with the HiNAND2 array shown in FIG. 1A, theBlock-decoder shown in FIG. 2A, the Segment-decoder shown in FIG. 2B,the Data & CACHE Register shown in FIG. 3, and the MLC cell's 4-Vtassignment shown in FIG. 4 according to embodiments of the presentinvention.

FIGS. 8D-8G are flow charts for multi-page SLC/MLC Read operations inaccordance with the HiNAND2 array shown in FIG. 1A, the Block-decodershown in FIG. 2A, the Segment-decoder shown in FIG. 2B, the Data & CACHERegister shown in FIG. 3, and the MLC cell's 4-Vt assignment shown inFIG. 4 according to embodiments of the present invention.

FIG. 8H is a table showing bias conditions for MLC Read of Even Pagewith Flag cell assigned to 0 associated with FIGS. 8A-8G according to anembodiment of the present invention.

FIG. 9A is a differential amplifier (DA) circuit diagram for generating,detecting, and latching a Vpgm voltage by setting Vref=Vpgm inComparator with a full RC-delay tracking capability for the selected WLduring a self-timed concurrent/pipeline multiple-WL Program operationaccording to an embodiment of the present invention.

FIG. 9B is a differential amplifier (DA) circuit diagram for generating,detecting, and latching a Vpass voltage by setting Vref=Vpass inComparator with a full RC-delay tracking capability for the selected WLduring a self-timed concurrent/pipeline multiple-WL Read and Verifyoperations according to an embodiment of the present invention.

FIG. 9C is a differential amplifier (DA) circuit diagram for generating,detecting, and latching a VLBLps up to Vinh voltage for self-timedconcurrent/pipeline operations according to an embodiment of the presentinvention.

5. DETAILED DESCRIPTION

In the following detailed description of the present embodiments, reference is made to the accompanying drawings that form a part hereof, and in which are shown, by way of illustration, specific embodiments in which the disclosure may be practiced. In the drawings, like numerals describe substantially similar components throughout the several views. These embodiments are described in sufficient detail to enable those skilled in the art to practice the embodiments. Other embodiments may be utilized and structural, logical, and electrical changes may be made without departing from the scope of the present disclosure. The following detailed description, therefore, is not to be taken in a limiting sense.

FIG. 1A shows an embodiment of a preferred 2D HiNAND2 array 200 with a 2-level, 2-metal hierarchical Even and Odd numbered cell array, having top global metal2 GBLs and bottom local metal1 LBLs, that comprises J divided Groups with the option of equal or unequal sizes. In this example, the whole HiNAND2 array is divided into J broken vertical Groups connected by 4 KB top-level 4λ-width broken metal2 GBLs per two bottom-level 8 KB broken metal1 2λ-width LBLs per Group. Note, λ is the minimum pitch width in the NAND technology node. In FIG. 1A, each LBL is not further divided for simplicity of explanation.

In an embodiment, each long metal2 GBL is divided into J broken GBLs by J-1 GBL-divided NMOS MGBL transistors such as MGBL₁ to MGBL_(J-1) for J NAND Groups. The J-1 gates of J-1 MGBL transistors, MGBL₁ to MGBL_(J-1), in each GBL are separately connected to J-1 signals of DIVen[1] to DIVen[J-1] from Group 1 (201) to Group J (20j), respectively. Furthermore, the bottom-level 8 KB LBLs of each Group are further divided into L pairs of 4 KB Even LBLs and 4 KB Odd LBLs, each of which corresponds to a C_(LBL) capacitor or a Segment. Here N=4 KB. All 8 KB LBLs have a line length identical to the corresponding top metal2 broken GBL. Each pair of neighboring Segments is connected via a row of bridge transistors MLBLb commonly gated by a TIE signal. For the total of L Segments in each Group, there will be L/2 rows of such connection transistors respectively gated by TIE1 through TIEL/2 signals.

In this application, 8 KB and 2N-bit, or 4 KB and N-bit, are used interchangeably in this description and should be treated as the same. As explained later, each LBL or C_(LBL) capacitor is also referred to as one Segment metal1 line or Segment parasitic metal1 capacitor or each pseudo CACHE register. Each Segment is comprised of L NAND Blocks, which are vertically connected by 8 KB metal1 LBLs. In one or more embodiments, L=4 is used for illustrating purposes throughout the specification although other numbers can be used without being limited in scope.

Next, each of the L pairs of 4 KB metal1 Even and Odd LBLs is connected to one shared metal2 GBL via one transistor MLBLpe and MLBLpo, respectively. The bridge transistor MLBLb is preferably made with a BVDS≈7˜10V, the same as String-select transistors MG and MS in each NAND String. The J-1 gates of MGBL transistors, MGBL₁ to MGBL_(J-1), are separately connected to J-1 signals of DIVen[1] to DIVen[J-1] from Group 1 to Group J, respectively. Each row has 4 KB MGBL transistors with 2N cells in one physical WL of the HiNAND2 array. Note, each Group size can be optionally made with a different metal2 GBL length, thus different C_(GBL) capacitances. As a result, each metal1 C_(LBL) length and capacitance will be made different accordingly. Each GBL length is identical to each LBL length in layout if each LBL is not further divided.

Each Group size (or each metal2 GBL or each metal1 LBL length) can be made different for achieving a balanced charge-sharing performance between each C_(LBL) capacitor and the corresponding C_(GBL) capacitor in conjunction with the proper control over the on and off state of each GBL-divided transistor MGBL. For example, in order to read analog data of 4 KB cells from Group 1 to an N-bit 4 KB Data Cache & Register 700 located at the top-end of the HiNAND2 array 200 with the least signal degradation caused by each C_(LBL)/J×C_(GBL) charge-sharing effect, all J-1 MGBL transistors between Group 1 and Group J have to be shut off by setting DIVen[1] to DIVen[J-1]=Vss. Thus the Ratio of C_(LBL)/J×C_(GBL)=1 because J=1, which provides the strongest cell analog signal to each Multiplier (to be shown below) of each Data Cache & Register 700. Conversely, when reading analog data of 4 KB cells from Group J-1 to the N-bit 4 KB Data Cache & Register 700 located at the top-end, the cell signal suffers the largest degradation of 1/J by the largest Ratio of signal dilution of the C_(LBL)/J×C_(GBL) charge-sharing effect. In this case, all J-1 MGBL transistors between Group 1 and Group J have to be turned on by setting DIVen[1] to DIVen[J-1]=Vdd to H1. In summary, reading Group J suffers more signal degradation, due to more signal dilution of the LBL/GBL charge-sharing effect, than reading Group 1. In order to balance the Read and Verify cell signal level, Group 1 can be made with the shortest GBL/LBL length. Conversely, Group J can be made with the longest GBL/LBL length.
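The relation between the selected Group, the DIVen settings and the resulting dilution can be sketched as follows. Two assumptions are made that the text does not state: for an intermediate Group g, only MGBL₁ through MGBL_(g-1) are turned on, and the shared voltage follows ordinary charge conservation between one C_(LBL) and the connected GBL sections, rather than the simplified C_(LBL)/J×C_(GBL) ratio quoted above. Function names and capacitance values are hypothetical.

```python
def diven_settings(group, num_groups):
    """Return the states of DIVen[1]..DIVen[J-1] for reading the given Group.

    'Vdd' turns an MGBL transistor on, 'Vss' shuts it off. Group 1 needs all
    off; Group J needs all on. Intermediate cases are an assumption.
    """
    if not 1 <= group <= num_groups:
        raise ValueError("group must be between 1 and num_groups")
    return ["Vdd" if k < group else "Vss" for k in range(1, num_groups)]


def shared_voltage(v_lbl, c_lbl, c_gbl, sections, v_gbl=0.0):
    """Voltage after one C_LBL shares charge with `sections` connected C_GBL.

    Plain charge conservation: V = (sum of C*V) / (sum of C).
    """
    total_c = c_lbl + sections * c_gbl
    return (c_lbl * v_lbl + sections * c_gbl * v_gbl) / total_c


# Hypothetical 4-Group array with equal unit capacitances and a 7 V LBL level.
near = shared_voltage(7.0, 1.0, 1.0, sections=1)   # reading Group 1
far = shared_voltage(7.0, 1.0, 1.0, sections=4)    # reading Group 4
```

Under these assumptions the signal seen at the top-end register shrinks as more GBL sections are chained in, which is why a farther Group needs either a longer LBL or a higher precharge level.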

In another approach, if Group 1 through Group J are made with the same GBL/LBL length, then each LBL precharge voltage can be made different to balance the sensed cell signal. For example, each C_(LBL) capacitor in Group J can be precharged with the highest LBL voltage of Vinh, and the precharge voltage is then decreased progressively from Group J-1 down to Group 1, which is precharged to Vdd-Vt of 1V where Vdd=1.8V.
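One way to picture the progressive precharge levels is a simple ramp from Vdd-Vt in Group 1 to Vinh in Group J. The linear spacing below is purely an assumption for illustration, since the text only says the level decreases progressively; the function name and the 7 V Vinh value are hypothetical.

```python
def lbl_precharge_levels(num_groups, v_low, v_inh):
    """Return per-Group LBL precharge levels, index 0 being Group 1.

    Group 1 gets v_low (for example Vdd-Vt), Group J gets v_inh, and the
    Groups in between are spaced linearly (an illustrative assumption).
    """
    if num_groups < 1:
        raise ValueError("num_groups must be at least 1")
    if num_groups == 1:
        return [v_inh]
    step = (v_inh - v_low) / (num_groups - 1)
    return [v_low + step * g for g in range(num_groups)]


# Hypothetical example: 4 Groups, Vdd-Vt = 1.0 V, Vinh = 7.0 V.
levels_by_group = lbl_precharge_levels(4, 1.0, 7.0)
```

In practice the per-Group levels would more likely be chosen to cancel the measured dilution of each Group than to follow a straight line.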

Referring to FIG. 1A, each preferred HiNAND2 Group (201 through 20j) further comprises L Segments, and each Segment further comprises K Blocks. Each NAND Block further comprises N vertical Strings and each String has 64 NAND cells connected in series with their gates respectively connected to 64 horizontal poly2 WLs such as WL[1] to WL[64], and one top and one bottom String-select NMOS transistor, MS and MG, with their separate gates respectively tied to SSL and GSL poly2 lines.

In addition, the bottom ends of each LBLe line and each LBLo line are connected to the drain nodes of a pair of pull-down NMOS transistors, MLBLse and MLBLso. In Group 1, their gates are coupled to PREe1[1] and PREo1[1] with a common source line of LBLps1[1] in Segment 1, and to PREe1[L] and PREo1[L] with a common source line of LBLps1[L] in Segment L. In Group J, their gates are coupled to PREeJ[1] and PREoJ[1] with a common source line of LBLpsJ[1] in Segment 1, and to PREeJ[L] and PREoJ[L] with a common source line of LBLpsJ[L] in Segment L.

Each HiNAND2 Group (201 through 20j) has L physical Segments and each Segment has 8 KB metal1 C_(LBL) capacitors corresponding to 8 KB pseudo CACHE Registers. Each metal1 C_(LBL) capacitor is connected to K small Blocks and each broken metal2 C_(GBL) capacitor is connected to L pairs of Segments.

One other major feature of the Segment configuration is that the C_(LBL) capacitors of two adjacent Segments are made as a pair, between which an NMOS bridge transistor MLBLb, with its gate coupled to a corresponding TIE signal, is used to connect the paired C_(LBL) capacitors. When TIE≧Vinh+Vt to turn on MLBLb, the charges of Vinh or Vss in both adjacent CACHE C_(LBL) capacitors are shared and a final balanced voltage is reached. This charge-sharing operation between the C_(LBL) capacitors of two adjacent Segments is a very useful technique: when one C_(LBL) is discharged because the selected cell is in the conduction state, the paired C_(LBL) capacitor is also discharged to the same voltage level, which saves tedious steps to transfer read data between two adjacent paired CACHEs.
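The paired-CACHE behavior can be sketched with a small behavioral model. It assumes two equal C_(LBL) capacitors and treats a conducting cell as an ideal pull-down to Vss; the function name and arguments are hypothetical, and the CACHEcel/CACHEint labels follow the naming used earlier in this description.

```python
def paired_cache_voltages(v_cel, v_int, tie_on, cell_conducting, v_ss=0.0):
    """Return final (CACHEcel, CACHEint) voltages for one paired C_LBL node.

    tie_on:          True when TIE >= Vinh+Vt so the bridge MLBLb conducts
    cell_conducting: True when the selected cell discharges CACHEcel
    Equal capacitances are assumed for the passive charge-sharing case.
    """
    if not tie_on:
        # Bridge off: the two pseudo CACHEs keep independent voltages.
        return (v_ss if cell_conducting else v_cel), v_int
    if cell_conducting:
        # Bridge on and the cell keeps conducting: both nodes end at Vss.
        return v_ss, v_ss
    # Bridge on, no discharge path: charge is shared to a balanced level.
    shared = (v_cel + v_int) / 2.0
    return shared, shared


# Hypothetical Vinh of 7 V on both capacitors before a Read.
read_on_cell = paired_cache_voltages(7.0, 7.0, tie_on=True, cell_conducting=True)
read_off_cell = paired_cache_voltages(7.0, 7.0, tie_on=True, cell_conducting=False)
```

With the bridge on, both pseudo CACHEs end up holding the same read result, so no separate transfer step through the shared GBLs is needed.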

Referring to FIG. 1A again, on the top-end of the HiNAND2 array 200, one N-bit real CACHE Register and one N-bit Data Register (D/R) are disposed as the N-bit Data Cache & Register 700. All above mentioned Median HV transistors of MGBL, MLBLse, MLBLso, MLBLpe and MLBLpo are preferably made of NMOS devices with the same or higher BVDS than the NAND String-select NMOS device of MS or MG.

FIG. 1B shows another embodiment of a 2D HiNAND2 array 200′ with a preferred 2-level, 2-metal, BL-hierarchical cell array that comprises similar J isolated Groups 201′ through 20j′ with equal or unequal sizes divided by J-1 rows of MHV MGBL transistors (˜10V) as explained above in FIG. 1A. Compared with the HiNAND2 array shown in FIG. 1A, this 2D HiNAND2 array has similar 2N (8 KB) 2λ metal1 LBL lines or 2N (8 KB) C_(LBL) capacitors (also termed 8 KB pseudo CACHE capacitors) per CACHE but N/2 (2 KB) 8λ metal2 GBL lines or N/2 (2 KB) C_(GBL) capacitors, one N/2-bit (2 KB) DB and one N/2-bit (2 KB) real CACHE register.

As seen in FIG. 1A, the HiNAND2 array 200 has N metal2 GBLs associated with 2N NAND physical cells in one physical WL (or page), corresponding to 2N metal1 tight 2λ LBL lines and C_(LBL) capacitors, and N paired transistors MLBLpo and MLBLpe, MLBLso and MLBLse per one Group per one metal2 broken GBL. In total, the HiNAND2 array 200 has N (4 KB) metal2 broken GBLs with loose 4λ pitch size and 4 KB C_(GBL) capacitors per Group, and only the N metal2 broken GBLs in top Group 1 are connected directly to a corresponding N-bit Data Cache & Register (or simply Data Buffer (DB)) 700 without going through any GBL-divided device such as MGBL transistors. The rest of the J-1 Groups have to go through J-1 MGBL transistors to connect to the N-bit DB 700. For example, the C_(GBL) capacitor in Group 2 needs to go through the MGBL₁ transistor and then via the C_(GBL) capacitor in Group 1 to connect to one bit of the corresponding N-bit 4 KB DB 700.

In contrast, as seen in FIG. 1B, the HiNAND2 array 200′ is formed with only N/2 (2 KB) metal2 GBLs with looser 8λ pitch size and 2 KB C_(GBL) capacitors per Group in accordance with the same 2N physical NAND cells and 2N (8 KB) metal1 LBLs corresponding to 2N (8 KB) C_(LBL) capacitors in one physical page or WL. In other words, the total number of the metal2 GBLs is only ¼ of the total number of NAND cells per each physical page or WL. Each top-level loose 8λ-pitch metal2 C_(GBL) associated with each GBL is connected to four bottom-level tight 2λ-pitch metal1 C_(LBL) capacitors associated with the corresponding four LBLs. As such, the N MLBLpo, N MLBLpe, N MLBLso, and N MLBLse transistors in FIG. 1A have been replaced by N/2 MLBLpa, N/2 MLBLpb, N/2 MLBLpc, and N/2 MLBLpd and N/2 MLBLsa, N/2 MLBLsb, N/2 MLBLsc, and N/2 MLBLsd.

Similarly to FIG. 1A, one major feature of the Segment configuration is that two C_(LBL) capacitors of adjacent Segments are made as a pair in FIG. 1B, between which an NMOS bridge transistor, MLBLb, with its gate coupled to a TIE signal, is used to connect the paired C_(LBL) capacitors. When TIE≧Vinh+Vt to turn on the bridge transistor MLBLb, the charges of Vinh or Vss in both adjacent capacitors are shared and a final balanced voltage is reached. This charge-sharing operation between the C_(LBL) capacitors of two adjacent Segments is a very useful technique: when one C_(LBL) is selected for discharging, the paired C_(LBL) capacitor is also discharged to the same voltage level due to the conduction state of MLBLb. With the bridge transistor MLBLb fully turned on, the data stored in the paired C_(LBL) capacitors would be the same. As a result, both the N-bit real CACHE Register and the N-bit 4 KB Data Register (DR) in DB 700 are reduced to an N/2-bit 2 KB real CACHE Register and an N/2-bit 2 KB DR in accordance with the N/2 (2 KB) top-level loose 8λ-pitch metal2 GBLs. Thereby, in FIG. 1B, the sizes of CACHE and DR have been cut in half from those in FIG. 1A so that a big area saving in the NAND peripheral circuit is achieved.

FIG. 1C shows a plurality of Consolidated Physical NAND Blocks in one NAND plane of the prior art as compared to a plurality of Dispersed Logic NAND Blocks in one NAND plane of the present invention. In conventional NAND, one Physical Block having a total of 64 pages is the minimum unit of Erase size, but Program is performed on a page-by-page basis 64 times to complete a 64-page Program and Program-Verify operation. In the conventional m-page Program sequence, a strict rule needs to be followed by starting from the bottom page near the bottom String-select transistor MG and proceeding to the top page near the top String-select transistor MS. In the embodiment of this invention shown on the right side of FIG. 1C, the preferred HiNAND2 array scheme is comprised of 64 Dispersed physical Blocks, based on which preferred m-page concurrent operations such as Program, Program-Verify, Read and Erase-Verify operations can be performed on m random pages in m random Segments in one or more Groups. More than one Segment in each Group can be selected for concurrent m-page operations with only one restriction, that one Block is selected per one selected Segment. In an embodiment, more than one pseudo 2N-bit CACHE register is used during the whole course of the preferred m-page operations to store the temporary m-page data for both SLC and MLC operations.

Within the conventional NAND scheme the minimum erase size is one Consolidated Physical Block that contains 64 WLs or Pages. In contrast, for the HiNAND2 scheme with Dispersed Logic Blocks, the minimum erase size is defined as one random page in one Physical Block per one Segment. The total of m selected pages for an m-page operation are widely distributed in one or more Segments in one or more Groups. If four equal page-number Blocks per four Segments per Group are selected for m-page Program operation, then m/4 Groups are required to perform the concurrent m-page Program and Program-Verify operation. In this invention, a non-equal random number of Segments per Group can be selected for the m-page Program and Program-Verify operations. For example, if a total of 64 pages are selected for a concurrent m-page Program operation, then four Segments can be selected in Group 1, six Segments in Group 2, five Segments in Group 3, one Segment in Group 4, and four Segments in each of the remaining Group 5 through Group 16 for the 64-page concurrent Program and Program-Verify operations.
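The 64-page allocation example above can be checked with a short sketch, relying on the stated rule that one Block, and therefore one page, is selected per selected Segment. The helper name `check_allocation` is hypothetical.

```python
def check_allocation(segments_per_group, m):
    """Return True when the selected Segments across all Groups supply m pages.

    One page is counted per selected Segment, since only one Block is
    selected per selected Segment in an m-page concurrent operation.
    """
    if any(count < 0 for count in segments_per_group.values()):
        raise ValueError("Segment counts cannot be negative")
    return sum(segments_per_group.values()) == m


# Allocation from the example: Groups 1-4 are uneven, Groups 5-16 use four each.
allocation = {1: 4, 2: 6, 3: 5, 4: 1}
allocation.update({group: 4 for group in range(5, 17)})

total_pages = sum(allocation.values())
is_valid = check_allocation(allocation, 64)
```

The uneven counts in Group 1 through Group 4 add up to 16 pages and the twelve remaining Groups contribute 48, giving the 64 pages of the example.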

The minimum size for performing Program, Erase, and Read operations is normalized to one physical page or one physical WL per Segment. The preferred m-page Block operation of the HiNAND2 array means m-page Logic Block operations that select m dispersed WLs in m dispersed Segments in one or more Groups for concurrent operations to achieve an m-fold latency reduction in m-page operation.

In an embodiment, in case some of the total m pages fail during the Program operation, there are two options to complete the required m-page concurrent Program operation. For example, if 3 pages fail in the 64-page Program, a first proposed option is that these 3 pages are assigned 3 newly erased pages for continuing a 3-page concurrent Program and Program-Verify operation. In other words, the first 64-page Program was performed collectively and concurrently on 64 selected WLs. After a predetermined time, 3 pages are found with failed Program. Then these 3 failed pages are assigned 3 newly erased WLs to continue the second 3-page concurrent Program. If the second 3-page concurrent Program passes, then the whole 64-page Program is finished. Otherwise, the process is continued until all 64 pages are programmed successfully. A second option is to combine the 3 failed pages with 61 new pages to make a total of 64 pages for a new 64-page Program. In other words, each Program is always performed in a 64-page unit to keep the m-fold latency saving. If there are no more new pages to be programmed, then the first option is the preferred choice.
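The two recovery options can be sketched as follows. This is only an illustration under stated assumptions: `program_ok` is a hypothetical callback standing in for the Program-Verify result of one WL, `erased_pool` is a hypothetical list of newly erased WLs, and pages are represented by plain identifiers.

```python
def program_with_reassignment(wls, program_ok, erased_pool):
    """First option: failed pages are retried on newly erased WLs only."""
    rounds = 0
    pending = list(wls)
    while pending:
        rounds += 1
        failed = [wl for wl in pending if not program_ok(wl)]
        if len(failed) > len(erased_pool):
            raise RuntimeError("not enough erased WLs for reassignment")
        # Each failed page is moved to one newly erased WL for the next round.
        pending = [erased_pool.pop(0) for _ in failed]
    return rounds


def refill_batch(failed_pages, waiting_pages, batch_size=64):
    """Second option: top up the failed pages with new pages to a full batch."""
    take = batch_size - len(failed_pages)
    return failed_pages + waiting_pages[:take], waiting_pages[take:]
```

With 3 failures out of 64 WLs, the first function finishes in two rounds if the 3 replacement WLs pass, and the second function builds a new batch of 3 failed plus 61 new pages.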

Note, the multi-page Program size can be less than 64 pages. The 64-page Program merely illustrates a maximum, but not limiting, number of pages that can be programmed concurrently according to embodiments of the present invention. Thus any smaller number of pages can also be programmed concurrently to save program latency.

In an alternative embodiment, a HiNAND1 array with one BL level is provided with nLC 2D or 3D NAND chip design. The HiNAND1 array includes a plane of NAND cells physically divided into a plurality of Groups in column direction to allow the multi-page self-timed flexible concurrent and pipeline operations. Accordingly several peripheral circuits including a data-register with at least 50% bit-reduction, Block and Group decoders with self-timed control are provided.

Specifically, the plane of NAND cells in nLC design is formed into rows of pages (WLs) and columns of bit lines (BLs). The columns are divided into a plurality of Groups by multiple rows of Group-divided devices. Each Group further comprises a plurality of Blocks and one dedicated Group power line, and each Block further comprises a plurality of Strings in column direction arranged one by one in row direction. Within each divided Group, all drain nodes of all Strings of all Blocks are connected by a BL metal line laid in the column direction and all source nodes are connected to a common SL made of another metal or non-metal line laid in the row direction. All BL metal lines of each divided Group of the HiNAND1 array act as one row of capacitor-based CACHE register for independently storing the program and read data for the self-timed m-page concurrent and pipeline operations such as Program, Program-Verify, Erase-Verify, and Read or a mixed combination of the above operations. Although the HiNAND1 array has less flexibility to load and read data, because all different pages of m-page data have to be loaded in and read out in a “sequential” manner, the advantage of the whole HiNAND1 array lies in the use of a single metal line for all BLs to save manufacturing cost.

FIG. 2A shows an embodiment of a preferred Block-decoder circuit 1000 that comprises a unique latch circuit 1010 with a novel status check circuit 1040 having an output PAS, a Pre-decoder circuit 1030 with three inputs of Pi, Qj, and Sk, and one Local HV Pump circuit 1020 to enable the plurality of HV/LV connections between each Block's Pre-decoder inputs of XT1 through XT64, GSLp, and SSLp and the corresponding WL1 through WL64, GSL, and SSL outputs of m selected Blocks of the HiNAND2 array (FIG. 1A) and the associated circuits of DB, CACHE, and I/O Control during the preferred m-page mixed SLC/MLC Program, Program-Verify, Read, Erase-Verify operations, etc. As seen, a CLWL signal coupled with a NAND4 device is used to clear or discharge the HV left on all 64 WLs, SSL, and GSL in m randomly selected Segments once each Block operation is verified successfully, to reduce the WL stress on the m selected Blocks. Note, all above mentioned multi-page operations require HV to be applied on the selected 64 WLs, SSL, and GSL. A WLPH signal coupled to a NAND1 device is configured to dedicate a clock signal for the Block-decoder's pump circuit 1020.

When all 3 input addresses are at Vdd (Pi=Qj=Sk=Vdd) for the selected Block-decoder 1000, XDM is set to Vdd to enable the status check circuit 1040 by turning on NMOS transistor MN1 to couple a ground voltage Vss. With the XDM node set to Vdd and a one-shot pulse applied from ENS, the selected Segment latch (1010) is set to make the XDB node at Vss and the XD node at Vdd. The latch circuit 1010 comprises one INV3 and one INV4.

When the CLWL signal is at Vdd and the XD node is at Vdd, the XDP node becomes Vdd, the WLPH signal is at 0V, and the HXD node becomes Vdd because MN10 is a native NMOS device with Vt˜0V. As a result, the pump circuit 1020 is stopped, and the HV trapped on WL1 through WL64 and the Vdd trapped on SSL and GSL are discharged to Vss through the corresponding NMOS devices MNH1 to MNH64, MNS2, and MNS3 if the selected Block passes the Program and Program-Verify, to reduce the WL stress on the gates of NAND cells.

During the multi-page Program operation, the selected XDP node is at Vdd to enable the local pump circuit 1020 so that the HXD node voltage can rise higher than Vpgm+Vt to allow the full passage of Vpgm to the selected WLs and of Vpass to the non-selected 63 WLs in m Blocks simultaneously.

Lastly, one more important function of this Block-decoder is the capability of responding immediately, by setting the HXD node at Vdd and stopping the local pump, to discharge all latched HV and LV on m selected sets of 64 WLs + 1 SSL + 1 GSL when an unintended Vdd power loss is detected. This can be easily done by setting the following conditions.

a) Setting the CLWL signal to Vdd with a time interval longer than 200 ns. No exact time control is necessary. The decline of Vdd will control the discharge time automatically.
b) Setting the WLPH signal to 0V so that the Local pump is disabled.
c) Setting XT1˜XT64=SSLp=GSLp=0V.
d) XD=Vdd for those selected Block-decoders due to the Latch setting at the beginning of Block operations. The Latch is made of INV3 and INV4. Initially, when Pi, Qj, and Sk are matched and ENS applies a one-shot pulse of Vdd, the XDB node is set to 0 and the XD node is set to 1 (Vdd).
e) XDP=Vdd, thus HXD≈Vdd because MN6 is a native NMOS device with Vt≈0V. As such, the HV trapped on the WLs is discharged over the power-down time so that over-program is eliminated. After Vdd is restored within a reasonable time of seconds, the trapped C_(LBL) SLC or MLC page data patterns can be used and the unfinished prior Program operations can be continued.
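The conditions a) to e) can be read as a simple logic check. The sketch below is a logic-level illustration only, not the actual gate network of FIG. 2A: signals are modelled as 1 (Vdd) or 0 (0V), the function and parameter names are hypothetical, and the relation XDP = CLWL AND XD is taken from the description of the CLWL discharge path given earlier in this section.

```python
def hxd_discharges(clwl, clwl_ns, wlph, xt, sslp, gslp, xd):
    """Logic-level check of conditions a) to e); 1 means Vdd, 0 means 0V."""
    # c) all Pre-decoder inputs XT1..XT64, SSLp and GSLp are at 0V
    inputs_grounded = not any(xt) and sslp == 0 and gslp == 0
    # d), e) XDP follows CLWL when the latch output XD is already at Vdd
    xdp = 1 if (clwl == 1 and xd == 1) else 0
    return (
        clwl_ns > 200        # a) CLWL held at Vdd for longer than 200 ns
        and wlph == 0        # b) pump clock stopped
        and inputs_grounded
        and xdp == 1         # HXD is then about Vdd and the WLs can discharge
    )
```

Only Block-decoders whose latch was set at the start of the Block operation (XD=1) satisfy the check, which mirrors the statement that the discharge applies to the m selected Blocks.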

FIG. 2B shows an embodiment of a preferred Segment-decoder circuit 2000 that comprises a latch circuit 2010 with two latches and a status check circuit 2040 with a S_PAS output, a Pre-decoder circuit 2030 with 3 inputs of Ri, Tj, and Gk, and one Local HV Pump circuit 2020. The pump enables one HV or LV input of either SEGpo or SEGpe to be coupled to the corresponding gate line SEGo or SEGe output for properly operating this preferred HiNAND2 array (FIG. 1A) with even and odd number and its associated circuits of DB, CACHE and I/O Control during the preferred m-page or page-based or mixed Program, Program-Verify, Read, Erase-Verify operations, etc. With this Segment-decoder and the above Block-decoder, the desired Vpgm, Vpass, Vread, and Vdd voltages of m sets of 64 WLs, 1 SSL and 1 GSL lines can be independently latched on all selected WLs Block-by-Block so that the m-page concurrent pipeline Program, Program-Verify, Erase-Verify, and Read operations can be performed with up to m-fold reduction in operation time.

The symbols i and j may not have the same value for the two Pre-decoder inputs of either Block-decoder 1000 or Segment-decoder 2000. They are used herein to depict three possible Pre-decoders only. Another output of Segment-decoder 2000 is Sk, which is used as the third Pre-decoder input of Block-decoder 1000 to enable the associated Block-decoder within a selected Segment.

During the charge-sharing period, the charges of Vinh stored for passed program cells in each metal1 C_(LBL) capacitor will be shared with one corresponding large metal2 C_(GBL) capacitor. In order to have a full passage of Vinh from each odd and even C_(LBL) capacitor to each common metal2 C_(GBL) capacitor, the gate voltage SEGe or SEGo is charged up to Vinh+Vt+ΔV at the VHV port, where ΔV (<1V) is added to further reduce the resistances of the corresponding transistors MLBLpe and MLBLpo.

During the C_(LBL) precharge cycle from the Segment power line LBLps, gate signals SEGo and SEGe have to be set to shut off for preventing leakage. During the Vdd/Vss to Vinh/Vss conversion cycle, SEGo=Vdd and SEGe=Vss or vice versa if V_(GBL)=Vdd/Vss in accordance with the page data. Note, odd and even number C_(LBL) capacitors are precharged at the same time but the Vdd/Vss to Vinh/Vss conversion is done 4 KB (½-page) each time.

FIG. 3 shows two bits of 2N-bit NAND cells in one page of HiNAND2 array 200 sharing one bit of N-bit DR (Data Register) 100 and one bit of real CACHE Register 101 outside HiNAND2 array (200), one NMOS Ypass circuit 300 with YAi and YBj column decoder inputs, and one Byte-wide I/O Control circuit 500, to demonstrate the inventive concepts of the present invention.

In an embodiment, there are 2N-bit NAND cells connected in one physical WL or page and 2N NAND strings with 2N drain nodes connected to 2N metal1 2λ-pitch LBLs in one NAND block that contains 2N metal1 tight 2λ LBLs and C_(LBL) capacitors but N metal2 loose 4λ-pitch GBLs and C_(GBL) capacitors in HiNAND2 array 200 (FIG. 1A). Each bit of DR comprises a 1-bit Multiplier 102, a 1-bit DRAM-like Sense Amplifier (SA) 104, a 1-bit Program/Read Buffer (P/RB) 106, and a 1-bit Program-Status check circuit 108 with the following defined functions.

a) Multiplier (102): It is used to amplify the weak diluted analog cell signal originating within each selected small CACHEcel C_(LBL) capacitor due to the charge-sharing effect that occurs along its sensed or read path with one or more large C_(GBL) capacitors. The extent of the analog cell signal dilution depends on the location of the selected Group away from each corresponding DB's SA circuit 104. This is termed a first analog amplification stage. For example, if the selected cells of a selected physical page are in Group J's 2N-bit CACHEcel C_(LBL) capacitors, then the charges of Vinh≧7V stored in each Segment's small metal1 C_(LBL) capacitor and each corresponding comparable metal2 Jth C_(GBL) capacitor jointly will be diluted by J C_(GBL) capacitors from Group J to Group 1, with J-1 GBL-divided transistors MGBL₁-MGBL_(J-1) being turned on jointly by coupling J-1 DIV_EN signals to Vdd or higher to provide a signal path between the selected cell in the Jth Group and a corresponding Multiplier in DR.
b) SA (104): This is a Latch-type SA circuit that performs a second, digital amplification after the Multiplier's first analog amplification. The amplified analog cell signal is presented at the OUTP node, with the original weak analog input signal of the cell presented at each corresponding PBL node, and is further amplified by each corresponding SA to get a set of full digital signals Qi and QiB by switching the T5 clock signal from Vss to Vdd. In addition, there are two added Capacitors 1 and 2 with different job assignments. CAP1 is used to store the reverse polarity of the last MSB data bit that was read out from each corresponding NAND cell located in each CACHEcel and is then latched here for the subsequent LSB bit's data evaluation.
c) P/RB (106): It is like a conventional Page-buffer that stores each old verified data bit for both SLC and MLC state evaluation during each iterative Program, Program-Verify and Erase-Verify operation performed in Block-mode.
d) Check circuit (108): This Page-program Check circuit performs a Page-based program check. Whenever a new 2N-bit, 8 KB, Program-Verify page data is loaded, then all N bits of DiB=Vss, thus Di=Vdd to inhibit the corresponding N-bit cells from further iterative program, and V_(PASS)=Vdd.
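The dilution that the Multiplier has to compensate can be estimated from charge conservation. The sketch below is a simplified illustration under loudly stated assumptions: the capacitance values are arbitrary units chosen for the example, all C_(GBL) capacitors are assumed to start at 0V, and a single C_(LBL) charged to Vinh is shared with J equal C_(GBL) capacitors. None of these numbers come from the disclosure.

```python
def shared_voltage(caps_and_volts):
    """Final voltage after ideal charge sharing: sum(C*V) / sum(C)."""
    total_charge = sum(c * v for c, v in caps_and_volts)
    total_cap = sum(c for c, _ in caps_and_volts)
    return total_charge / total_cap


def diluted_signal(vinh, c_lbl, c_gbl, j):
    """One C_LBL at Vinh shared with J C_GBL capacitors assumed to be at 0V."""
    return shared_voltage([(c_lbl, vinh)] + [(c_gbl, 0.0)] * j)


# Hypothetical values: Vinh = 7V and C_GBL equal to C_LBL.
for j in (1, 2, 6):
    print(j, diluted_signal(7.0, 1.0, 1.0, j))
```

Under these assumptions the signal seen at the Data Register shrinks as the selected Group moves farther away (larger J), which is the location dependence described in item a).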

Each metal2 4λ-pitch GBL is connected to the input of each corresponding LV Multiplier circuit through a 20V NMOS buffer transistor, MN6, with its gate tied to BIAS. One data-bit input from one I/O is coupled to one bit of P/RB through a Y-pass NMOS transistor. Each P/RB output is connected to each PBL node and then connected to each corresponding metal2 GBL through the same MN6. Along the read path starting from a NAND cell in a selected CACHEcel with a small metal1 C_(LBL) capacitor through a larger metal2 C_(GBL) capacitor to reach each corresponding SA, the analog cell signal is diluted in the HiNAND2 array scheme. Therefore, each diluted read analog signal has to be loaded into each corresponding Multiplier to perform the first analog amplification and then into each corresponding Latch-type SA to perform the second amplification, which is designed to perform analog to digital conversion. The full digital paired outputs Qi and QiB of each SA are then coupled to each corresponding P/RB input for distinguishing the stored states of SLC and MLC with the same circuit but different steps. In other words, one preferred DB circuit serves the preferred m-page Read and Program operations of a hybrid SLC/MLC HiNAND2 array.

FIG. 4 is a diagram showing two Vt distributions of a 2-Vt SLC NAND cell or a 2-Vt MLC MSB cell and a 4-Vt MLC NAND cell containing both a 2-Vt MSB bit and a 2-Vt LSB bit used in the preferred hybrid HiNAND2 array of FIG. 1A according to an embodiment of the present invention. Each SLC cell or MLC MSB bit has two Vt states such as E and B′ with VR1 as the one selected Read WL voltage, while each MLC cell has four Vt states such as E, A, B and C with three Verify voltages such as VR1, VR2 and VR3. The top graph depicts two Vt distributions of each 2-Vt SLC NAND cell used in the preferred hybrid HiNAND2 array (e.g., 200 of FIG. 1A). Each 2-Vt SLC NAND cell has two Vt states. The first one is an erased state termed E-state that stores “1” data with a negative Vte distribution, ranging from Vtemin to Vtemax. The Vtemax value is preferably set to −0.5V, as is typical for prior-art NAND. The second Vt is a programmed state termed A-state that stores “0” data with a positive Vta value, typically ranging from a Vtamin of +0.5V to a Vtamax of 1.5V. Both Vtamin and Vtamax can be shifted higher away from Vtemax for a greater ΔVt margin between E-state and A-state.

As in prior-art 2-Vt NAND, the desired SLC Program operation of the HiNAND2 array is to shift Vt higher from E-state to B′-state. Conversely, the desired Erase operation is to shift Vt lower from B′-state to E-state. Since one physical SLC NAND cell stores 2 Vt states, there is only one Erase E-state and one Program A-state, and the cell is programmed by what is called a one-pass Program.

Further, the bottom graph shows four Vt distributions of a MLC cell including E, A, B, and C states. There are three final desired Program states such as A, B, and C but one intermediate Program state of B′-state, which is generated during the first MSB bit Program operation.

In an embodiment, this preferred MLC cell's program operation needs an Erase operation first to set all MLC cells' initial Vt to E-state. The four final desired Vt distributions are E-state with Vte<Vtemax (−0.5V), A-state with Vtamin (0.5V)<Vta<Vtamax (1.0V), B-state with Vtbmin (1.5V)<Vtb<Vtbmax (2.0V) and C-state with Vtcmin (2.5V)<Vtc<Vtcmax (3.0V). The transient B′-state Vt is defined as Vtb′min<Vtb′<Vtb′max. The value of Vtb′min is less than Vtbmin (1.5V) but Vtb′max is preferably set to Vtb′max≦Vtbmax. One physical MLC cell stores two logic bits with four final states, namely one Erase E-state and three Program states A, B, and C, with the B′-state of the MSB bit being transferred to the eventual B-state or C-state.
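The Vt windows listed above can be written down as a small lookup. This is a minimal sketch for illustration only: the function name and data layout are assumptions, the window limits are the values quoted in the paragraph, and a Vt falling between windows is reported as `None` rather than assigned to a state.

```python
VTE_MAX = -0.5  # upper limit of the erased E-state, in volts
MLC_WINDOWS = {
    "A": (0.5, 1.0),
    "B": (1.5, 2.0),
    "C": (2.5, 3.0),
}


def classify_vt(vt):
    """Return the final MLC state for a threshold voltage, or None if in a gap."""
    if vt < VTE_MAX:
        return "E"
    for state, (vt_min, vt_max) in MLC_WINDOWS.items():
        if vt_min < vt < vt_max:
            return state
    return None


print([classify_vt(v) for v in (-1.0, 0.75, 1.75, 2.75, 1.2)])
# ['E', 'A', 'B', 'C', None]
```

The 0.5V gaps between adjacent windows are where the Read and Verify voltages would sit; their exact values are not given in this passage and are therefore not modelled.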

For MLC cell Program, the HiNAND2 cell and array use a 2-pass MLC Program scheme, which starts from a first pass of MLC MSB bit Program, followed by a second pass of MLC LSB bit Program. Other MLC MSB and MLC Program schemes and Vt code assignments can also be applied here to illustrate the concept of a preferred m-page Program but are omitted herein for simplicity of description.

FIG. 5 shows preferred multi-page Read Command Timing Waveforms of a hybrid HiNAND2 array in an embodiment of the present invention. It is known that a conventional NAND's page-based Read Command begins with a start-code of a single page Read followed by a few bytes of single page address, with one end-code last. The single page address is comprised of a first few bytes of one row (for one WL) followed by a few bytes of columns of the NAND memory array. All these byte numbers of rows and columns of one page or WL are subject to the accessed NAND's array organization and density, regardless of 2D or 3D NAND.

In contrast, FIG. 5 provides a hybrid HiNAND2 Read Command. It also begins with a start-code but is flexibly followed by one Block address that is comprised of N flexible page addresses placed in series in the command file, with a unique end-code last, where N is set to 1≦N≦M if a Block is made of M-WL Strings.

Each single-page address comprises a first few bytes of one row (for one WL) followed by a few bytes of columns of the NAND memory array. All these byte numbers of rows and columns of one page or WL are subject to the accessed NAND's array organization, density, and memory type, regardless of 2D or 3D NAND. In other words, the addressed pages can be either SLC or MLC types.
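The command layout described above (start-code, N page addresses in series, end-code) can be illustrated with a short framing sketch. Everything numeric here is a hypothetical placeholder: the start and end code values and the 3-byte row and 2-byte column widths are assumptions chosen for the example, since the text states that the byte counts depend on the array organization and density.

```python
START_CODE = 0xA0  # hypothetical start-code value
END_CODE = 0xAF    # hypothetical end-code value
ROW_BYTES = 3      # assumed row address width
COL_BYTES = 2      # assumed column address width


def build_multipage_read(pages, wls_per_block):
    """Frame N (row, column) page addresses between a start-code and an end-code."""
    if not 1 <= len(pages) <= wls_per_block:
        raise ValueError("N must satisfy 1 <= N <= M")
    frame = bytearray([START_CODE])
    for row, column in pages:
        frame += row.to_bytes(ROW_BYTES, "big")     # row bytes first
        frame += column.to_bytes(COL_BYTES, "big")  # then column bytes
    frame.append(END_CODE)
    return bytes(frame)


command = build_multipage_read([(0x000123, 0), (0x004567, 0), (0x0089AB, 0)], 64)
print(len(command))  # 17 bytes: 1 start + 3 * (3 + 2) address + 1 end
```

A conventional single-page command corresponds to the N=1 case of the same frame.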

In certain embodiments, preferred flow charts and sets of the bias conditions for performing SLC, MLC MSB, and MLC LSB Read, Program, and Program-Verify operations via a plurality of pseudo CACHEs are provided in accordance with the HiNAND2 array shown in FIG. 1A, the Block-decoder shown in FIG. 2A, the Segment-decoder shown in FIG. 2B, the Data Buffer shown in FIG. 3, and a MLC cell's 4-Vt assignment shown in FIG. 4. For example, in FIG. 3, each Data Register bit 100 comprises one bit of Multiplier 102, one bit of SA 104, one bit of P/RB 106, and one Program-Verify circuit 108. Further, in this application, the operations of loading external SLC or MLC page data via 8 I/Os into Flash's real CACHEs (or Flash data) and outputting from the 4 KB CACHE to a Flash Controller via the 8 I/Os are subject to the NAND flash I/O pin configuration. In conventional parallel NAND Flash, 8 I/Os are commonly used. But in conventional SPI 8-pin Serial NAND flash, 1 to 4 I/Os are commonly used. In current or future parallel NAND Flash I/O configurations, 16 or 32 I/Os may be popularly used. Regardless of the kind of I/Os, the multi-page operations of the present invention can be applied.

In the present application, an 8-I/O Parallel NAND with one physical WL of 8 KB, 2N-bit size HiNAND2 memory is used as an example to demonstrate the inventive concepts of these preferred multi-page operations without any limitation, even including A/D and D/A bidirectional I/Os. In addition, embodiments of the present invention also include several new Commands to support the preferred multi-page operations such as MLC-WL and SLC-WL Read, Program, Erase, Program-Verify and Erase-Verify, etc.

FIG. 6A is a flow chart showing a method for performing SLC/MLC (MSB page) multi-page Data Loading and Program according to an embodiment of the present invention. The flow deals with 8 KB 2-Vt SLC cells in one selected SLC-WL or 8 KB 2-Vt MLC MSB cells in one selected MLC-WL regarding 8 KB data loading into a plurality of CACHEs from 8 external I/Os and the 8 KB Program operation. Basically, this is an m-page, 2-Vt SLC and 2-Vt MLC MSB page Data Loading and Program flow. Because both cases involve either a 2-Vt SLC cell or a 2-Vt MLC cell that only stores a 2-Vt MSB bit, each operation is treated as the same and the two are combined here into one flow for a simpler explanation. Further, each 2-Vt SLC page or 2-Vt MLC MSB page operation is divided into two symmetrical sub-flows: a first multi-page Even SLC or Even MSB page operation and then a second Odd SLC or Odd MSB page operation. All SLC and MSB page data are sequentially loaded into the designated 8 KB pseudo CACHE registers from 8 I/Os in units of bytes. For an 8 KB size of one physical WL (page) of the present invention, each 2-Vt SLC page or 2-Vt MLC MSB page is also defined as 8 KB logic size. Thus each Even and Odd SLC or MLC page is defined as a 4 KB physical page as well as a 4 KB logic page. All SLC and MLC MSB page data are preferably loaded into 4 KB on-chip CACHE registers via 4 KB metal2 GBLs, which are shared by 4 KB Even and 4 KB Odd local metal1 LBLs to save peripheral DB size and area.
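The Even/Odd division and the loading time can be sketched as below. Two assumptions are made loudly here and are not confirmed by the text: Even and Odd halves are taken as alternating column positions (matching paired Even and Odd LBLs sharing one GBL), and one byte is transferred per cycle over 8 I/Os. The function names are illustrative.

```python
def split_even_odd(columns):
    """Split one page into Even-column and Odd-column halves (assumed interleave)."""
    return columns[0::2], columns[1::2]


def merge_even_odd(even, odd):
    """Inverse of split_even_odd for equally sized halves."""
    merged = [None] * (len(even) + len(odd))
    merged[0::2] = even
    merged[1::2] = odd
    return merged


def load_cycles(num_bytes, io_count=8):
    """Cycles to move num_bytes over io_count I/Os, one bit per I/O per cycle."""
    return num_bytes * 8 // io_count


page = list(range(16))              # tiny stand-in for the 2N columns of one WL
even, odd = split_even_odd(page)
print(even, odd)
print(load_cycles(8 * 1024), load_cycles(4 * 1024))  # 8192 4096
```

With 8 I/Os the cycle counts come out as 8K cycles for a full 8 KB page and 4K cycles for each 4 KB half, consistent with the figures quoted for the loading steps.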

Note, since the present invention is disclosed for a SLC/MLC WL-hybrid 64-WL NAND Block of the HiNAND2 array, when SLC and MLC page data are addressed below, they are meant to be in one common Block with 32 SLC-WLs and 32 MLC-WLs. As seen in FIG. 6A, the operation starts from Step 350.

Step 350: This step is to sequentially receive, load, and decode m-page 2-Vt SLC Read and m-page 2-Vt MLC MSB Program Commands and their respective m-page SLC and MLC Program Addresses in units of bytes via NAND's 8 I/Os from an off-chip Flash controller to NAND's designated Command and Address Buffers (not shown). In addition, m latches of m selected Even and m selected Odd Segments and m Block-decoders are also set according to the m-page Addresses stored in m Address Buffers for the concurrent m-page SLC and MLC MSB Program operation.

For example, the Command is loaded into the designated Command register so that the new SLC and MLC commands can be decoded and the associated SLC or MLC Program operations can be initiated accordingly. Similarly, m-page Addresses are loaded into m designated on-chip Address Buffers in conjunction with other control circuits to set the corresponding latches of m Even and Odd Segments as shown in FIG. 2B and m Block latches as shown in FIG. 2A of the preferred HiNAND2 array.

Moreover, m addressed 8 KB SLC or MLC MSB NAND page data are divided into m×4 KB Even-page data and m×4 KB Odd-page data. These m pages of SLC or MLC MSB data are selected concurrently by m Segment latches with m Block latches. With SLC and MLC MSB commands being proposed in NAND design, a flexible m-page Address arrangement is provided in the present invention, unlike the prior art that allows only one page address of SLC or MLC MSB in one selected NAND plane, with only one SLC or MLC MSB page Address specified in one common NAND Program command. But with the novel HiNAND2 array (e.g., see FIG. 1A), m-page SLC or MLC MSB Addresses can be addressed in every selected NAND plane. Thus a flexibility of up to m pages of Addresses can be specified in this novel SLC or MLC MSB Program command. Each page Address arrangement is like the prior-art single page Address arrangement, placing a few bytes of column Address first followed by a few bytes of row Address or vice versa. The major difference is that m pages of Addresses can be loaded in cascade between the start and end of the m-page SLC and MLC MSB Program command due to the m-page Program of the present invention, rather than a single page SLC or MLC MSB Program in conventional NAND.

For example, the SLC or MLC MSB first page's row Address is followed by the first page's column Address, then the second page's row Address is followed by the second page's column Address, and lastly the mth page's row Address is followed by the mth page's column Address, and then the end code. Note, since the preferred m-page SLC Read is still a Read operation, there is no need to load any page data from the external Flash Controller into the NAND flash as in a Program operation.

Step 351: This decision step is to check whether the m newly addressed Blocks and Segments are occupied by some existing concurrent operations. If yes, then the page loading is looped to wait until they are free and available for new setting and loading of SLC or MLC MSB page data. If not, then the flow moves to Step 352.

Step 352: m latches of newly selected m Blocks in m Segments can be set selectively with a new status data in accordance with the circuits shown in FIG. 2A and FIG. 2B with the following preferred bias conditions:

a) CLWL=CLA=CLR=0V,
b) ENB=1=Vdd,
c) ENS=one-shot pulse of Vdd.

The one-shot pulse of Vdd applied to ENS will set each Block-decoder's latch node XD to Vdd when the selected Block-decoder's 3 inputs of Pi, Qj, and Sk are matched. Each latch is made of INV3 and INV4 and its output node XD is gated by the CLWL signal.
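The latch-setting behavior of Step 352 can be modelled at the logic level. The class and function names below are hypothetical, the (p, q, s) tuples stand for a decoder's matching Pi, Qj, Sk combination, and the model covers only the set action described above; clearing through CLA/CLR and gating by CLWL are not modelled.

```python
from itertools import product


class BlockDecoderLatch:
    def __init__(self, p, q, s):
        self.address = (p, q, s)  # the Pi, Qj, Sk combination this decoder matches
        self.xd = 0               # latch output node XD, 0 = Vss

    def ens_pulse(self, p, q, s):
        """One-shot ENS pulse: XD is set only when Pi, Qj and Sk all match."""
        if (p, q, s) == self.address:
            self.xd = 1


def set_selected_latches(latches, selected_addresses):
    """Apply one ENS pulse per selected address and return the latched Blocks."""
    for address in selected_addresses:
        for latch in latches:
            latch.ens_pulse(*address)
    return [latch.address for latch in latches if latch.xd]


latches = [BlockDecoderLatch(p, q, s) for p, q, s in product(range(2), repeat=3)]
print(set_selected_latches(latches, [(0, 1, 0), (1, 1, 1)]))
# [(0, 1, 0), (1, 1, 1)]
```

Because each latch keeps its state after the pulse, the m Blocks can be selected one address at a time and still remain selected together for the concurrent operation.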

After this step, up to m pages of new 4 KB Even and Odd SLC and MLC MSB data will be sequentially loaded into the on-chip 4 KB real CACHE registers from the external Flash controller via NAND's 8 I/Os as indicated at Step 354 below.

Step 353: Basically, Step 353 is a preferred self-timed Precharging operation and can be done concurrently with Step 354 below.

Because Step 354 of both SLC and MLC MSB page loading takes a lengthy 8K cycles per logic page, the page data transfer only happens between the 8 I/Os and the 4 KB real CACHE registers, and the m program page Addresses have already been loaded into the Address Buffers, Step 353's 8 KB Even and Odd LBL precharging of these selected m pages can be started simultaneously with Step 354.

Like the prior-art NAND SLC or MLC MSB ABL (all-BL) 8 KB page Program, in which ABL global long metal1 lines are precharged with ≈1.0V, this preferred m-page SLC and MLC MSB Program also requires ABL precharging, but with a higher Vinh of up to 10V from a local common LBLps line into 8 KB short LBL lines within both CACHEcel and CACHEint pseudo registers for power saving and superior Program-Inhibit. Note, this m-page SLC or MLC MSB Program needs two types of m CACHEcel and CACHEint registers of m metal1 Segment C_(LBL) capacitors in one or more HiNAND2 Groups to complete the m-page Program.

The LBL precharging of Vinh can be performed like the SLC Read operation that requires a BL precharging with ≈1V of Vdd-Vt on long BLs at the beginning of each single-page Read operation. But this preferred m-page SLC Read operation only needs to precharge m pages of 1/(L×J) shorter LBL lines for less power consumption.

Besides providing the superior Program-Inhibit mentioned above, charging to Vinh also gives superior LBL sensing by yielding a larger analog NAND cell signal after the LBL and GBL charge-sharing during the Program-Verify step. A Vinh voltage higher than Vdd=1.0V would guarantee more reliable sensing of the data and states stored in the NAND cells.

Note, the selected 8 KB NAND cells per physical WL are within one selected Block of one selected Segment that comprises 8 KB pseudo CACHE C_(LBL) capacitors. Any pseudo CACHE register is termed a CACHEcel when the 8 KB selected cells of one selected full WL are within it.

Step 353 uses one preferred set of bias conditions in accordance with the HiNAND2 array circuit shown in FIG. 1A:

a) CSL=SEGe=SEGo=0V,
b) PREe=PREo=H1,
c) LBLps=Vinh,
d) TIE1˜TIEL/2=DIV=0V (L=4): TIE1 through TIEL/2 are set to 0V to shut off the corresponding MLBLb NMOS bridge transistors so that both 8 KB CACHEcel and CACHEint C_(LBL) capacitors are independently precharged with Vinh from the selected LBLps=Vinh. SEGo=SEGe=0V prevents each paired 4 KB Odd and 4 KB Even C_(LBL) from leaking to the one shared corresponding 4 KB metal2 GBLs. CSL=0V is a regular setup for a normal NAND string Read operation.
e) PREo=PREe=H1>Vinh+Vt: this turns on both MLBLso and MLBLse transistors so that the Segment power supply of Vinh can be fully coupled from the selected LBLps lines to the selected 8 KB CACHEcel's C_(LBL) capacitors without voltage drop.
f) LBLps=Vinh is supplied by a central Vinh MHV pump circuit (not shown). The precharge time of the m 8 KB CACHEcel's Odd and Even C_(LBL) capacitors is controlled by the on-chip State-machine design.

The 8 KB CACHEcel precharge time is controlled by a self-timed LBLps Vinh Detector circuit. This is done by using one shared LBLps line as a Vinh power supply line as well as a Vinh sensing line. The Vinh supply comes from one end of the LBLps line connected to a Vinh Driver, while the Vinh Detector circuit operates at the other end of the LBLps line. Once the LBLps line reaches the Vinh voltage, it means that the m 8 KB C_(LBL) capacitors in CACHEcel are fully charged with Vinh, so the Vinh Detector will issue a signal to the on-chip State-machine to stop the Vinh precharge operation. This Vinh precharge time thus can be very accurately and automatically controlled by the present invention in accordance with the circuit explanation shown in FIG. 9C (to be seen below).
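The self-timed idea can be illustrated with a toy model in which the precharge stops when a detector at the far end of the line sees the target level, instead of after a fixed programmed time. This is only a sketch under strong assumptions that are not from the disclosure: the far end of LBLps is modelled as a single RC node with time constant `tau`, and the detector trips at 99% of Vinh.

```python
import math


def self_timed_precharge(vinh, tau, trip_fraction=0.99, dt=None):
    """Return the time at which the far-end detector trips and stops the precharge."""
    dt = tau / 1000 if dt is None else dt
    t = 0.0
    v_far_end = 0.0
    while v_far_end < trip_fraction * vinh:
        t += dt
        # Assumed single-pole charging of the far end of the LBLps line.
        v_far_end = vinh * (1.0 - math.exp(-t / tau))
    return t


light_load = self_timed_precharge(vinh=7.0, tau=1.0)
heavy_load = self_timed_precharge(vinh=7.0, tau=4.0)
print(light_load < heavy_load)  # True
```

In this model a heavier load (larger `tau`, for example more selected pages) simply makes the detector trip later, so no per-case timing needs to be programmed, which is the benefit attributed to the self-timed scheme.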

Step 354: This step is to sequentially load either 8 KB SLC or MLC MSB page data into the 4 KB real CACHE registers. Each page of data is divided into 4 KB Even and 4 KB Odd pages to accommodate the 4 KB CACHEs, 101, shown in FIG. 3 and the 4 KB global metal2 GBLs shown in FIG. 1A for area saving. The bias conditions for page data loading are set forth in accordance with the HiNAND2 array circuit and CACHE Register.

Step 355: This decision step is to check whether the last byte of the external 8 KB SLC and MLC MSB page is completely loaded. If No, then it is looped to wait for the completion of each whole 8 KB page data loading. If Yes, then the flow moves to Step 357 to do a further GBL availability check and to set the RDY signal by pulling it low in the manner of a one-shot pulse, informing the off-chip Flash Controller that the NAND is entering a busy state. No more SLC or MLC MSB page loading from the 8 I/Os to the 4 KB CACHE is allowed during this period.

Step 356: Until the 4 KB GBL bus lines are released from any other concurrent operations, the GBLs remain occupied. Thus, the NAND chip will generate a one-shot RDY signal so that the Flash Controller will not forward any new page data into the real 4 KB CACHE register, because its last page data is still in the real CACHE.

Step 357: This is a decision step to check whether the common 4 KB metal2 GBL bus lines are occupied by some existing concurrent operations. Once the 4 KB metal2 GBL bus lines are free, the flow moves to Step 358.

Step 358: This step is to write the externally loaded 4 KB Even data in the 4 KB real CACHE into the two selected 4 KB Even CACHEcel and 4 KB Even CACHEint registers with the same data polarity and the same initial Vinh LBL precharged voltage. This step converts the digital Vdd/Vss SLC or MLC MSB page bit pattern in the 4 KB real CACHE to an MHV analog Vinh/Vss bit pattern in the 4 KB CACHEcel and 4 KB CACHEint pseudo Registers for superior Program-Inhibit and Program voltages in the subsequent SLC or MLC MSB Program operation. Whenever an Even CACHEcel or Even CACHEint capacitor at the Vinh voltage is coupled to Vss, it is discharged to Vss. On the contrary, whenever an Even CACHEcel or Even CACHEint capacitor at the Vinh voltage is coupled to Vdd, Vinh is retained when the gates of the MLBLso and MLBLse transistors are coupled to Vdd as seen in FIG. 1A, in accordance with the following bias conditions:

    a) CSL=TIE1˜TIEL/2=SEGo=0V (L=4),
    b) SEGe=1 for both CACHEcel and CACHEint registers,
    c) DIVen=BIAS=LD=H1 one-shot pulse.
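As a non-limiting illustration, the Vdd/Vss to Vinh/Vss conversion of Step 358 can be sketched in Python as follows. This is only a behavioral sketch: the numeric value of Vinh and the function names are hypothetical and are not taken from this disclosure.

```python
# Hypothetical level for illustration only; no Vinh value is fixed here.
VINH = 7.0  # program-inhibit voltage Vinh (assumed)
VSS = 0.0


def convert_page_to_lbl(bits, vinh=VINH, vss=VSS):
    """Map digital page bits (1 = Vdd, 0 = Vss) to C_LBL voltages.

    Every C_LBL capacitor starts precharged at Vinh. A bit at Vss
    discharges its capacitor to Vss; a bit at Vdd leaves it at Vinh.
    """
    return [vinh if b == 1 else vss for b in bits]


def write_even_half_page(bits):
    """Store the same converted pattern in both pseudo registers."""
    cachecel = convert_page_to_lbl(bits)
    cacheint = list(cachecel)  # same polarity, separate capacitors
    return cachecel, cacheint


cel, cint = write_even_half_page([1, 0, 0, 1])
print(cel)  # [7.0, 0.0, 0.0, 7.0]
```

In this sketch a data bit of 1 corresponds to a Program-Inhibit level (Vinh) and a data bit of 0 to a Program level (Vss), matching the polarity described for Step 358.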

Step 359: Once the above 4 KB Even SLC or MLC MSB page data is latched as converted Vinh/Vss analog data in the two designated registers, the 4 KB Even CACHEcel and the 4 KB Even CACHEint, the state machine pulls down the RDY pin to inform the off-chip Controller that the 4 KB real CACHE registers are available to be loaded with new Odd page data.

Steps 360-365: These six steps for the Odd page are substantially similar to the above six steps 354-359 for Even page loading into the real CACHE and the two pseudo registers CACHEcel and CACHEint. They sequentially load the remaining 4 KB Odd SLC or MLC MSB page data from the 8 I/Os under the same bias conditions. These steps take another lengthy 4K cycles to complete the 4 KB Odd page loading between the 8 I/Os and the 4 KB real CACHE registers.

After Step 365, the whole 8 KB SLC or MLC MSB page has been completely loaded and is temporarily held in the final designated 8 KB pseudo CACHEcel and 8 KB CACHEint C_(LBL) capacitors. The off-chip Flash Controller is informed by a one-shot RDY signal from the NAND that the 4 KB real CACHE register is free to receive a new operational Command.

Step 366: This decision step checks whether the last page of all selected m pages for a Block SLC or MLC MSB Program has been fully loaded into the HiNAND2 array. If Yes, the flow moves to Step 367. If No, the flow moves to Step 368 and Step 369 in parallel. Step 368 loops to wait for completion of the last 8 KB page data loading so that the Host can issue the next selected address and command and load new page data.

Step 367: This confirmation step recognizes receipt of the m-page Program Confirmation code so that the next Step 369, the preferred m-page SLC or MLC MSB Program, can start immediately once the data of all m pages has been completely loaded into the m designated 8 KB CACHEint and CACHEcel registers with the preferred Vinh/Vss Program-Inhibit conversion.

Step 368: The Host notices that the real CACHE of the NAND is available, so the Host continues issuing a new page Address, Command, and Page data for the next m-page SLC or MLC MSB Program and repeats the operations from Step 350 to load the remaining pages.

Step 369: Once all m pages of data are stored in the m designated CACHEcel and CACHEint C_(LBL) pseudo registers, the next step is to set up and latch the proper Vpgm, Vpass, and Vdd voltages for the m selected sets of 64 WLs and SSL and GSL lines before the m-page SLC or MLC MSB concurrent all-BL Program.

Since each set of 64WLs+1SSL+1GSL voltages is respectively coupled from 64 XTs, an SSLp, and a GSLp, the setup of each set of 64WLs+1SSL+1GSL has to be done on a set-by-set basis. In particular, the m-page SLC or MLC MSB Program is preferably done on m random WLs or pages for the greatest flexibility in NAND file system manipulation. Conventionally, setting up m sets of 64WLs+1SSL+1GSL at once is impossible for random WLs. But under the option of m non-random page Program, the m sets of 64WLs+1SSL+1GSL can be set and latched in one cycle. In the present invention, the m-page SLC and MLC MSB Program scheme works for both options, i.e., m random-page and m non-random-page SLC and MLC MSB Program.

In a specific embodiment, m random-page Program is performed. Note that both the WL program voltage setup and the latching on the m selected parasitic poly2 capacitors are self-timed operations. The program voltage setup means precharging one selected WL with Vpgm (15V-25V), 63 non-selected WLs with Vpass (8V-10V), one SSL=H1≧Vinh+Vt, and one GSL=Vss for one Block. The WL setup means, as a first step, applying the desired WL voltages of Vpgm, Vpass, H1, and Vss to the common shared bus lines XT1-XT64, SSLp, and GSLp. Then the m selected page Addresses enable and pump each local Block-decoder to pass the full Vpgm, Vpass, and H1 voltages to the designated set of 64WLs+1SSL+1GSL. The local HXD node voltage has to be pumped up to a value greater than Vpgm+Vt in accordance with the preferred Block-decoder circuit (FIG. 2A). The Vpgm voltage and time control is done automatically by the Vpgm Detector circuit placed at one end of the dummy WLs. The setup conditions between the 64 XTs, 1 SSLp, and 1 GSLp and the 64 WLs, 1 SSL, and 1 GSL lines and the HXD node are summarized below in Table 1. The HXD signal plays an important bridging role between the above 64XTs+1SSLp+1GSLp and 64WLs+1SSL+1GSL for each selected Block. When the HXD node is pumped to Vpgm+Vt, Vpgm, Vpass, and H1 precharging takes place on the corresponding 64WLs+1SSL+1GSL from the set of common bus lines 64XTs+1SSLp+1GSLp. Conversely, when the HXD node is set to Vss by a Vpgm Detector with a Block-latch status circuit, the precharged Vpgm, Vpass, and H1 voltages are latched on the large parasitic poly2 capacitors of the respective 64WLs+1SSL+1GSL lines for a long time. Then the page Program starts. After one self-timed iterative ISPP Program operation of around 10 μs-20 μs, the HXD node is turned on with Vdd again by the Vpgm time Control circuit of each Segment.

TABLE 1: One set of 64WLs + 1SSL + 1GSL Program voltage setup & latching

    1 WL(sel) = Vpgm                      XT(sel) = Vpgm
    63 WLs(un-sel) = Vpass                63 XT(un-sel) = Vpass
    1 SSL = H1                            SSLp = H1
    1 GSL = Vss                           GSLp = Vss
    HXD with matched Pi, Qj, Sk to        Vpgm + Vt when setting but
    enable local pump and latch           Vss when latching

After each iterative Program step, the HXD node is set to Vdd and 64XTs=1SSLp=1GSLp are set to Vss, so that Vpgm, Vpass, and H1 are discharged accordingly and 64WLs=1SSL=1GSL=Vss. After the WL HV discharge, the WL HV stress is removed, which extends the cell P/E cycle life. When the m sets of 64WLs+1SSL+1GSL voltages have been discharged to Vss, as detected by a Vpgm Detector, the operation mode is switched to the iterative m-page concurrent Program-Verify operation.

The 2-Vt Program-Verify voltage for both MLC MSB and SLC is a common value of Vtb′min, but the 4-Vt MLC MSB and LSB Program-Verify requires more Vvfy voltages, such as Vtamin, Vtbmin, and Vtcmin. Table 2 below shows how one set of Program-Verify voltages is set up between each set of 64XTs+1SSLp+1GSLp and each corresponding set of 64WLs+1SSL+1GSL. In an embodiment of the present invention, each setup between 64XTs+1SSLp+1GSLp and 64WLs+1SSL+1GSL is done one set at a time because the 64XTs+1SSLp+1GSLp are shared by the m randomly selected sets of 64WLs+1SSL+1GSL. Thus, for an m random-page SLC or MSB Program operation, the Program and Program-Verify setup steps each take m cycles.

TABLE 2: One set of 64WLs + 1SSL + 1GSL Program-Verify voltage setup & latching

    1 WL(sel) = Vvfy                      1 XT(sel) = Vvfy
    63 WLs(un-sel) = Vread                63 XTs(un-sel) = Vread
    1 SSL = Vdd                           1 SSLp = Vdd
    1 GSL = Vread                         1 GSLp = Vread
    HXD with matched Pi, Qj, Sk to        Vread + Vt when setting but
    enable local pump and latch           Vss when latching

As explained above, if the selected WL location is the same in each corresponding Block of 64WLs+1SSL+1GSL, then both the Program and Program-Verify voltage setup of the m selected non-random sets of 64WLs+1SSL+1GSL can be done in one cycle instead of m cycles. This is a substantial power and time saving provided by the HiNAND2 array scheme and its associated m-page operations and methodologies.
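The cycle counts described above for the shared 64XTs+1SSLp+1GSLp bus can be illustrated with the following minimal Python sketch. The function name and the representation of each Block by its selected WL position are assumptions made only for this illustration.

```python
def wl_setup_cycles(selected_wl_positions):
    """Count setup cycles on the shared 64XTs+1SSLp+1GSLp bus lines.

    selected_wl_positions holds one selected WL position (0..63) per
    selected Block. If every Block selects the same position (the
    non-random case), one bus pattern serves all sets and they are
    latched in one cycle. Otherwise (the random case) the sets are
    set up one at a time, taking m cycles for m pages.
    """
    m = len(selected_wl_positions)
    if m == 0:
        return 0
    if len(set(selected_wl_positions)) == 1:
        return 1
    return m


print(wl_setup_cycles([5, 5, 5, 5]))      # 1
print(wl_setup_cycles([5, 17, 40, 63]))   # 4
```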

The latching of the above Program and Program-Verify voltages on the selected 64WLs+1SSL+1GSL in accordance with Table 1 and Table 2 is accomplished by setting each selected HXD node to Vss after full precharge. As a result, the voltages of the common 64XTs+1SSLp+1GSLp become “Don't-care”, so the 64XTs+1SSLp+1GSLp are released for other new m-page concurrent NAND operations that may be urgently requested by the Flash Controller during this interval.

Step 370: This step latches the desired Vpgm, Vpass, and H1 voltages into the selected Block-decoders. This is done by a Vpgm Detector connected to a dummy WL that has two adjacent dummy WLs, so that it has the same parasitic WL-WL capacitance as a WL in a regular 64-cell Block. In this manner, the detected Vpgm precharge time tracks that of the selected WL charged by the same Vpgm, because the dummy WL tracks the WL resistance and capacitance.

Once Vpgm is detected, there is no need to detect Vpass and H1, because Vpgm is the highest voltage and the slowest HV signal on the WLs during Program. The Vpgm Detector issues a signal informing the selected Block to discharge the corresponding HXD node to Vss so that the fully charged 64WLs+1SSL+1GSL can be latched without leakage during the subsequent Program operation. As a result, the m-page Program voltage and time can be initiated and counted accurately and automatically. The bias conditions are listed below.

    a) CLWL=0V,
    b) CLA=CLR=ENS=0V,
    c) ENB=one-shot of negative pulse.

Note that the precharge setup and latching of each set of 64WLs+1SSL+1GSL poly lines are initiated at different times for m random-page Block Program, as explained above. But if the m selected WLs are all at the same location in the 64-cell NAND String, this becomes a non-random m-page Program. In that case, the m sets of 64WLs+1SSL+1GSL can be set at one time for an m-fold saving in precharging and latching the desired Vpgm, Vpass, and H1.

Step 371: This step indicates that all m random or non-random SLC or MLC MSB pages are selected for a concurrent m-page Program. In the m non-random-page Program case, the program time has an m-fold reduction because the m Program operations start at the same point on the timeline. By contrast, in the m random-page Program case, each random page is initiated by its Vpgm Detector at a different point on the timeline for each selected Block. In practice, the m random-page Program time may include an overtime period. Even so, a Program time reduction is still realized, with the highest flexibility in NAND design.

Note that each SLC or MLC MSB page Program takes about 250 μs on average. But each Program is divided into a plurality of ISPP program pulses, each with a duration of around 15 μs-20 μs. Thus each self-timed Vpgm control time refers to one iterative ISPP program time of 15 μs-20 μs, rather than the whole 250 μs. Each shorter ISPP program time is therefore easier to implement in the present invention with smaller on-chip RC devices.
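For a rough sense of scale, the figures above bound the number of ISPP pulses per page Program. The short Python calculation below is only an illustrative upper bound: it assumes the whole 250 μs budget is spent on program pulses and ignores the Program-Verify and setup time between pulses, so the actual pulse count would be lower.

```python
import math

PAGE_PROGRAM_US = 250.0   # average page Program time stated above
PULSE_US_MIN = 15.0       # shortest ISPP pulse stated above
PULSE_US_MAX = 20.0       # longest ISPP pulse stated above


def max_pulse_count(total_us, pulse_us):
    """Upper bound on whole pulses that fit in the total time budget."""
    return math.floor(total_us / pulse_us)


fewest = max_pulse_count(PAGE_PROGRAM_US, PULSE_US_MAX)  # 250 / 20
most = max_pulse_count(PAGE_PROGRAM_US, PULSE_US_MIN)    # 250 / 15
print(fewest, most)  # 12 16
```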

Step 372: This decision step checks whether the 64 XTs, 1 SSLp, and 1 GSLp bus lines are currently occupied. If the response is “No”, the flow moves on. If Yes, Step 372 loops to wait for release of these bus lines for the next desired operation that will use them. From Step 372, the flow splits into two paths: one path moves to Step 375 and the other path moves to Step 373.

Step 373: Again, Step 373 is a preferred self-timed Precharging operation and can be done concurrently with Step 375. This step prepares for the subsequent Program-Verify operation after each ISPP iterative program step. As always, prior to each Program-Verify step, the selected 8 KB LBLs have to be precharged first. For this step, one of the m selected sets of 8 KB LBLs is within one 8 KB CACHEcel, and the preferred LBL voltage is Vinh, for achieving a superior Program-Inhibit and a larger analog cell voltage for more reliable and efficient subsequent LBL sensing. The preferred bias conditions are explained below.

    a) DIVen=0, LBLps=Vinh,
    b) TIE1˜TIEL/2=CSL=SEGe=SEGo=0V (L=4),
        These conditions ensure that the CACHEcel precharge does not leak to the top-level GBLs or to the adjacent paired CACHEint C_(LBL) capacitors.
    c) PREo=PREe=H1>Vinh+Vt
        These conditions connect the LBLps line Vinh supply to the 8 KB CACHEcel C_(LBL) capacitors concurrently.

Step 374: Once the desired voltages are well set up in all selected 8 KB CACHEcel C_(LBL) capacitors after each ISPP program step, the Program-Verify voltages on the 64WLs+1SSL+1GSL lines are initiated in this step in accordance with the conditions indicated in Table 2 above. This step is a self-timed operation, automatically controlled by a Vread Detector circuit, which uses the same circuit as the Vpgm Detector except that the reference voltage Vpgm is replaced by Vread. Vread is the highest voltage and the slowest WL signal during the SLC and MLC MSB Program-Verify step.

Step 373 and Step 374 are lengthy steps, taking more than 10 μs and 3 μs respectively. Although the times at which each page Program is initiated do not coincide, each ISPP step takes longer than the WL setup time, so the m-page SLC and MLC MSB Program of the present invention will have overlapping time intervals. As successive ISPP steps proceed, the number of overlapping time intervals for concurrent Program and Program-Verify becomes high. In other words, some pages are engaged in the Program step while other pages may be independently engaged in the Program-Verify step concurrently, or vice versa.

Under a scenario of extremely busy Program and Program-Verify with multiple tasks being executed simultaneously, embodiments of the present invention still allow additional urgently requested operations initiated by the external Flash Controller, as long as no bus contention occurs on the common GBL bus lines, the 64 XTs bus lines, and the common 4 KB real CACHE registers.

FIG. 6B is a flow chart showing a method for performing m-page SLC/MLC (MSB Even Page) Program-Verify operations according to an embodiment of the present invention. As shown, this flow is mainly designed for the preferred m-page 2-Vt SLC or 2-Vt MSB Program-Verify step, while the method flow in FIG. 6A is mainly designed for the m-page 2-Vt SLC and 2-Vt MSB Program operation.

Step 375: This step restores the previously latched m pages of 8 KB SLC or 8 KB MLC MSB page-bit data in the 8 KB CACHEint C_(LBL) capacitors back to the 4 KB P/RB via the 4 KB limited GBLs, then the 4 KB Multiplier, and then the 4 KB SA, in units of a first 4 KB Even page of data followed by a second 4 KB Odd page of data. These restored original SLC and MLC MSB page data are used for the Program-Verify step, which needs to compare the newly retrieved 8 KB of cell page data against the 8 KB of original Program page data loaded from the external off-chip Flash Controller via the 8 I/Os.

Now, one of the m 4 KB SLC or MLC MSB Even pages of analog data latched in one of the 4 KB Even CACHEint registers at Step 353 in FIG. 6A is sequentially sensed and amplified by both the 4 KB Multipliers and the 4 KB SAs to perform a 2-step analog amplification, and the final 4 KB of fully amplified Even digital data is stored in the 4 KB SAs on a 4 KB ½-page by ½-page basis due to the limitation of the 4 KB metal2 GBL bus lines. Each readout SLC or MLC MSB data bit of Qi=0/1 at each SA is set by each corresponding C_(LBL)=Vss/Vinh with the same polarity. This readout of 4 KB Even page digital data can be done in 1 cycle through LBL and GBL charge-sharing, a first amplification through a Multiplier, and a second amplification through the SA, with the following preferred bias conditions:

    a) DIVen=SEGe (in CACHEcel)=H1,
        This turns on the MGBL transistors of the broken GBL line so that each sensed but diluted C_(LBL) voltage is passed to its corresponding Multiplier via charge sharing with the corresponding GBL line;
    b) TIE1˜TIEL/2=CSL=SEGo=SEGe (in CACHEcel)=0 (L=4),
        This shuts off the leakage path through MLBLb, whose gate is tied to TIE1 through TIEL/2, between each paired CACHEcel and CACHEint C_(LBL) capacitors so that the sensed analog cell signal at CACHEint is not diluted by the paired CACHEcel's capacitors.
    c) Voutp (high/low)=Vref+/−ΔV
        This Vref setup is done flexibly by setting the reference voltage Vref between the higher Vref+ΔV and the moving Vref−ΔV as discharge continues over time before reaching the final value of Vss when an E-cell is selected. The SA comparison does not need to wait for Vss on the LBL lines for the combined Multiplier and SA operation.
    d) T5=one reverse one-shot pulse of Vdd
        The T5 clock performs the second analog amplification, following the first analog amplification done by the Multiplier, and finally latches the fully amplified digital cell data at the SA Qi and QiB nodes.

Step 376: This step transfers each 4 KB of restored original SLC or MSB Even page bit data to the 4 KB P/RB via the 4 KB SA in 1 cycle, with the following bias conditions in accordance with the DB circuit shown in FIG. 3.

    a) ENSB1=ENSB2=0V,
    b) PGM=0, because not in program mode.
    c) IDB=IDAB=0V. This is to disconnect Transistor 8 and 6 to disable signals stored in MLSB and MLSBB.
    d) IDC=WBK=one shot of Vdd. As seen in the P/RB circuit, Qi is connected to the gate of Transistor 18 and QiB is connected to the gate of Transistor 17, but WBK is connected to the gate of Transistor 16 only. The one-shot pulses for IDC and WBK have to be synchronous by design. As a result, Qi=1/0 results in Di=1/0 in the same phase.
    e) T5=1 is used to enable SA.

After this step, the bit-flipping in each iterative Program and Program-Verify step, applied to the original SLC or MSB page data by the Even page data subsequently retrieved in Steps 378 to 383, can be performed between each paired SA and P/RB.

Steps 378 and 379: These two steps set up and lock in the desired Program-Verify voltages for the 64WLs+1SSL+1GSL lines in accordance with the preferred voltages shown in Table 2, with the following bias conditions:

    a) CLWL=CLA=0V,
    b) CLR=ENS=0V,
    c) ENB=one-shot from 1→0

In order to achieve more accurate and secure Program-Verify WL voltage and time control for each independent set of 64 WLs, SSL, and GSL, this invention uses three dummy WLs with a layout and length exactly identical to those of a regular WL, but only the middle dummy WL is used for the Vread Detector's tracking purpose. The reason for having two extra adjacent unused dummy WLs is to ensure that the same parasitic inter-WL capacitance is counted in the precharge-time calculation.

The Vread Detector is made of one 2-input Differential Amplifier (DA), as shown in FIG. 9B. One input of the DA is connected to the end of the middle dummy WL and the other input is connected to Vread, which is generated by a Vref generator. This Vref generator circuit can generate various Reference voltages such as Vpgm, Vpass, Vread, and VRn for the respective highest WL voltages in the respective Program, Program-Verify, and Read operations. The highest WL voltage takes the longest precharge time. Thus, once the highest WL voltage, Vread, is detected on the dummy WL, all other selected 64 WLs, 1 SSL, and 1 GSL have been well precharged to the desired voltage levels.

For this m-page SLC Program-Verify operation, the highest WL voltage is Vread on the 63 unselected WLs of each of the m selected Blocks. Accordingly, a reference of Vread-ΔV is switched in and connected to one input of the above Vread WL Detector. Upon detecting that the dummy WL has been precharged to Vread-ΔV, the DA's output issues a signal to the correspondingly selected SLC page address of one of the selected Blocks to latch the well-precharged voltages of the 64 WLs, 1 SSL, and 1 GSL lines concurrently on their parasitic WL capacitors, with an extra 100 ns-500 ns margin delay for the WL to reach the final Vread when ΔV is set to 0.5V. The detailed B′-state Program-Verify bias conditions are shown below in Table 3.

TABLE 3: One of 64WLs + 1SSL + 1GSL for m-page concurrent B′-state Program-Verify voltage setting (HXD = Vread + Vt) and latching (HXD = 0 V)

    1 WL(sel) = Vtb′min                   1 XT(sel) = Vtb′min
    63 WLs(un-sel) = Vread                63 XT(un-sel) = Vread
    1 SSL = Vdd                           1 SSLp = Vdd
    1 GSL = Vread                         1 GSLp = Vread
    HXD with matched Pi, Qj, Sk to        Vread + Vt when setting but
    enable local pump and latch           Vss when latching
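The self-timed detection with a Vread-ΔV reference followed by a margin delay can be illustrated with the following Python sketch. It assumes, purely for illustration, that the dummy WL charges as a single-pole RC node from 0 V toward Vread; the Vread value, the time constant, and the function names are hypothetical and are not specified in this disclosure.

```python
import math


def detect_time(v_final, delta_v, tau):
    """Time for an RC node charging from 0 V toward v_final to cross
    the detector reference v_final - delta_v.

    v(t) = v_final * (1 - exp(-t / tau)), so the crossing occurs at
    t = tau * ln(v_final / delta_v).
    """
    if not 0 < delta_v < v_final:
        raise ValueError("delta_v must lie between 0 and v_final")
    return tau * math.log(v_final / delta_v)


def latch_time(v_final, delta_v, tau, margin):
    """Detection time plus the fixed margin delay applied before the
    HXD node is pulled to Vss to latch the WL voltages."""
    return detect_time(v_final, delta_v, tau) + margin


# Hypothetical numbers: Vread = 6.0 V, tau = 1.0 us, margin = 0.3 us.
# Delta-V = 0.5 V is the example value given in the text.
t_detect = detect_time(6.0, 0.5, 1.0)
t_latch = latch_time(6.0, 0.5, 1.0, 0.3)
print(round(t_detect, 3), round(t_latch, 3))
```

Under this assumed model, the margin delay gives the WL time to move from Vread-ΔV toward the final Vread before latching.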

At the moment of latching, a novel self-timed LBL discharge operation is immediately initiated, as indicated at Step 380, in accordance with the LBL Discharge Detector circuit. The detailed subsequent steps are explained later.

Step 380: This step is another self-timed operation for the preferred m-page SLC or MSB Program-Verify. It performs Vinh discharging and retaining operations in accordance with the Program-Verify bit pattern of one of the m selected 8 KB CACHEcel registers.

To save Program-Verify time and WL Vread stress, both the 4 KB Even and 4 KB Odd C_(LBL) capacitors and cells are selected for cell state verification in each 8 KB CACHEcel register. This is like single-page ABL Read in the prior art. But this invention preferably performs m All-BL Program-Verify discharging and retaining operations simultaneously.

As indicated in Step 380, for E-state cells C_(LBL)=0V (discharging), but for B′-state cells C_(LBL)=Vinh (retaining). In order to control the C_(LBL) discharge time automatically and accurately, one VLBL Detector per Segment is built in, using one metal0 LBLps power line as a sense line. It is like a conventional CAM's sense line but without extra array layout overhead or precharge power consumption. Therefore, this C_(LBL) discharging and retaining operation is another preferred self-timed step of the present invention.

The reason for using each LBLps line as the 8 KB C_(LBL) discharging sense line is that the Vinh last precharged by the Vinh Driver is still retained on the LBLps line. Thus no sense-line precharge step is needed, which saves power. Once all 8 KB C_(LBL) capacitors are discharged, the larger lumped discharge current pulls each common LBLps line down from the initial Vinh. In one extreme and rare case, with only 1 or 2 E-cells on each selected WL, the discharge current becomes very small and might affect the discharge time. For this case, a maximum allowed Program-Verify time is also built in to ensure that the Program-Verify time stays within a predetermined delay, such as less than 5 μs. The final value of each C_(LBL) bit voltage is determined by the state of each NAND cell. If a NAND cell is in the E-state, the corresponding C_(LBL)'s Vinh is discharged to Vss; otherwise, the C_(LBL) voltage retains the initial Vinh if the NAND cell is in the B′-state with Vt≧Vtb′min.

Step 380 takes a shorter time to discharge the m 8 KB C_(LBL) capacitors from Vinh to Vss than the prior-art NAND BL discharge step from 1V to Vss, because each NAND cell string has the same large series resistance while the shorter LBL has a lighter C_(LBL) capacitance. In the present invention, although C_(LBL) is precharged to Vinh, which is higher than the 1.0V used in the prior art, each C_(LBL) discharge is still much faster than a C_(BL) discharge because C_(LBL)=C_(BL)/(L×J), where L is the number of Segments per Group and J is the total number of Groups per NAND plane. C_(LBL) is one local metal1 LBL capacitor of the present invention, while C_(BL) is one global metal1 BL capacitor of the prior art.
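The relationship C_(LBL)=C_(BL)/(L×J) can be illustrated numerically with the following Python sketch. It treats the NAND string as a fixed series resistance and the discharge as a simple RC decay to a sensing level, which is a simplifying assumption made only for this illustration. The values of J, Vinh, and the sensing level are hypothetical; L=4 is the example value used in the bias conditions.

```python
import math


def discharge_time(c_rel, v_start, v_sense, r_rel=1.0):
    """RC discharge time from v_start down to v_sense, in relative units:
    t = R * C * ln(v_start / v_sense)."""
    return r_rel * c_rel * math.log(v_start / v_sense)


L_SEGMENTS = 4    # example value L=4 from the bias conditions
J_GROUPS = 8      # hypothetical number of Groups per plane
VINH = 7.0        # hypothetical Vinh precharge level, in volts
V_SENSE = 0.1     # hypothetical level treated as "discharged", in volts

c_bl = 1.0                                # prior-art global BL capacitance
c_lbl = c_bl / (L_SEGMENTS * J_GROUPS)    # C_LBL = C_BL / (L x J)

t_prior = discharge_time(c_bl, 1.0, V_SENSE)   # prior art: 1 V start
t_lbl = discharge_time(c_lbl, VINH, V_SENSE)   # this scheme: Vinh start
print(t_lbl < t_prior)  # True
```

With these assumed numbers, the higher starting voltage lengthens the decay only logarithmically, while the capacitance shrinks by the full factor L×J, so the LBL discharge remains the faster of the two.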

Note that although the first selected SLC or MSB Program-Verify is for the ½-page 4 KB Even page, the discharge for B′-state evaluation is done on one full physical WL that contains both the 4 KB Even and 4 KB Odd pages concurrently, because the WL is shared. In this manner, 2×-SLC Program-Verify speed can be achieved. This is highly beneficial for the m-page SLC and MSB Program-Verify step because one of the major delays of the iterative Program operation is each iterative Program-Verify time. The preferred bias conditions are listed below.

    a) TIE= . . . =TIEL/2=DIVen=CSL=0V (e.g., L=4),
    b) SEGe=SEGo=0V
    c) PREe=PREo=0V,
    d) LBLps=0 or don't care.

Step 381: This step discharges all HV on the 64WLs+1SSL+1GSL lines once one iterative Program-Verify operation is finished. Since the LBL discharge is a self-timed step, this discharge is initiated automatically by the VLBL Detector and stopped by the dummy WL detector, which detects the voltage dropping from Vread to Vss or near Vss. The discharge of the 64WLs+1SSL+1GSL is done by setting the selected Block's HXD node to Vdd with the corresponding Local pump circuit disabled. This step takes less than 1 μs.

Steps 382 and 383: These steps are like Step 375 and Step 376, the difference being that the Program-Verify cells of the selected SLC or MSB WL to be sensed are located within each 8 KB CACHEcel rather than in CACHEint. For the details of the steps and bias conditions, refer to Step 375; they are omitted here for simplicity.

Step 383: This step transfers the 4 KB Even SLC or MSB page data stored in the 4 KB SA to the 4 KB P/RB, but with reversed polarity, because each SLC E-state cell's analog LBL voltage=0V while its digital logic data is defined as “1”. Conversely, each SLC B′-state cell's analog voltage is Vinh while its digital logic data is defined as “0.” Thus, the readout bit data from each SA to each P/RB has to be flipped before it is sent out to the Flash Controller via the 8 I/Os. Di=0/1 in each P/RB bit if the corresponding bit Qi=1/0 in each SA, in accordance with the preferred set of bias conditions below.

    a) IDC=IDAB=EQ=0,
    b) ENSB1=ENSB2=PGM=IDB=0V,
    c) IDC=one-shot pulse of Vdd with T5=1=Vdd.
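The polarity reversal of Step 383 can be summarized behaviorally with the following Python sketch. The function names and the Vinh value are hypothetical; only the mapping (E-state: C_(LBL)=0V, Qi=0, Di=1; B′-state: C_(LBL)=Vinh, Qi=1, Di=0) follows the description above.

```python
VINH = 7.0  # hypothetical Vinh level for illustration


def sense_q_bits(c_lbl_voltages, vinh=VINH):
    """SA bits Qi follow the analog polarity: Vinh -> 1, Vss -> 0.
    The midpoint threshold is an assumption of this sketch."""
    return [1 if v > vinh / 2 else 0 for v in c_lbl_voltages]


def sa_to_prb_inverted(q_bits):
    """P/RB bits Di are the logical inverse of the SA bits Qi."""
    return [1 - q for q in q_bits]


# E-state cells discharge to 0 V; B'-state cells retain Vinh.
lbl = [0.0, VINH, VINH, 0.0]
q = sense_q_bits(lbl)
d = sa_to_prb_inverted(q)
print(q, d)  # [0, 1, 1, 0] [1, 0, 0, 1]
```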

Step 384: This decision step checks whether all the selected SLC or MSB pages pass the Program-Verify operation with the selected WL voltage=Vtb′min. If yes, the Segment-decoder latches or flags are reset at Step 385. Thereafter, the flow moves on to check whether S-PAS=1 in the selected blocks within one selected Segment. If S-PAS=1, each Block-decoder latch or flag is reset at Step 387. The flow then continues to the Step 388 decision step.

Step 388: This decision step checks whether all the selected SLC or MSB flags have been reset. If yes, the m-page SLC or MLC MSB Program-Verify is completed at Step 389. If not, the flow continues in order to finish the remaining Even pages with iterative SLC or MSB Program and Program-Verify operations, and thus moves to Steps 390 and 391. For the next iterative operation, the old page data stored in the 4 KB P/RB is copied back to the designated CACHEcel and CACHEint for temporary storage for the subsequent SLC and MSB iterative operations.

Steps 390 and 391: Before the 4 KB Even SLC or MSB bit data is transferred from the 4 KB P/RB to the 4 KB CACHEcel and CACHEint, both registers require Vinh voltage precharging first. The transfer and the Vdd/Vss to Vinh/Vss conversion are then done accordingly. Next, the flow moves to Step 400.

FIG. 6C is a flow chart showing a method of performing m-page SLC/MLC (MSB Odd page) Program-Verify operation according to an embodiment of the present invention. As seen, the method includes Step 400, in which one of the m pages of old 4 KB SLC or MLC MSB Odd analog data latched in one of the 4 KB Odd CACHEint registers at Step 380 (see FIG. 6B) is sequentially sensed and amplified by both the corresponding 4 KB Multipliers and 4 KB SAs to perform the 2-step analog amplification explained before. The final 4 KB of fully amplified Odd digital data is stored in the 4 KB SAs with the following preferred bias conditions:

    a) DIVen=SEGo (in CACHEcel)=H1,
    b) TIE1˜TIEL/2=CSL=SEGe=0V and SEGe (in CACHEcel)=0V (L=4),
    c) Voutp (high/low)=Vref+/−ΔV,
    d) T5=reverse one-shot pulse of Vdd.

Step 401: This step transfers each 4 KB of restored original SLC or MSB Odd page bit data from the 4 KB SA to the 4 KB P/RB in 1 cycle, but with the reversed logic explained before, under the following bias conditions in accordance with the 4 KB DB circuit shown in FIG. 3.

    a) IDAB=SENB1=SENB2=EQ=0V
    b) IDC=WBK=one shot
    c) T5=1 is used to enable SA.

After this step, the previous old 4 KB SLC or MSB Odd page data is restored in the 4 KB P/RB for the next comparison against the newly Program-Verified Odd MSB page data retrieved from the selected 4 KB Odd NAND cells on the selected WLs.

Steps of 401 and 402: These two steps perform the same sensing and amplification as Step 400 and Step 401 of this flow, but on the 4 KB Odd NAND cells from one of the 4 KB Odd CACHEcel registers. The final 4 KB of fully amplified, newly Program-Verified Odd digital data is stored in the 4 KB SAs under similar bias conditions, which are omitted here.

Step 403: With the last 4 KB of old Odd SLC or MSB page data stored in the 4 KB P/RBs and the 4 KB of newly read Program-Verify data stored in the 4 KB SAs, the bit-flipping for Odd SLC and MSB page data is done in the 4 KB P/RB on a bit-by-bit basis. Each P/RB Di-bit=0 is flipped to Di-bit=1 by each corresponding Qi-bit=1 in the SA. But each P/RB Di-bit=1 is not flipped and remains unchanged, regardless of the value of each corresponding Qi-bit in the SA, in accordance with the bias conditions shown below.

    a) IDC=IDAB=EQ=0V,
    b) SENB1=SENB2=EQ=0V,
    c) IDC=one shot pulse,
    d) T5=1 is used to enable SA.
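The bit-flipping rule of Step 403 is equivalent to a bitwise OR of each P/RB bit with its SA bit, as the following Python sketch illustrates. The function names are hypothetical. The pass test in page_passed (all Di bits equal to 1) is an assumption of this sketch and is not stated explicitly above.

```python
def flip_program_bits(d_bits, q_bits):
    """Apply the Step 403 rule bit by bit.

    Di=0 with Qi=1 -> Di becomes 1 (cell verified, inhibit next pulse).
    Di=0 with Qi=0 -> Di stays 0 (cell still needs programming).
    Di=1           -> Di stays 1 regardless of Qi.
    This is the truth table of Di OR Qi.
    """
    return [d | q for d, q in zip(d_bits, q_bits)]


def page_passed(d_bits):
    """Assumed pass criterion: no bit is left requesting Program."""
    return all(d == 1 for d in d_bits)


old_d = [0, 0, 1, 1]   # original page data held in the P/RB
new_q = [1, 0, 1, 0]   # Program-Verify readout held in the SA
updated = flip_program_bits(old_d, new_q)
print(updated, page_passed(updated))  # [1, 0, 1, 1] False
```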

After this step, one iterative Program-Verify step for one 4 KB Odd SLC or MSB page Program is finished.

Step 404: This decision step checks whether the data of all the selected SLC or MSB Odd pages pass the Program-Verify operation in accordance with the 4 KB loaded Odd page data under the selected WL voltage=Vtb′min. If yes, the Segment-decoder latches or flags are reset at Step 405. Thereafter, the flow moves to another decision step, Step 406, to check whether S-PAS=1 in the selected blocks within one selected Segment. If S-PAS=1, each Block-decoder latch or flag is reset at Step 407. The flow then continues to the Step 408 decision step to check whether all latches of the Block-decoders have been reset. If yes, all selected Odd pages have passed Program-Verify. That means both Even and Odd pages pass the SLC and MSB Program-Verify, and the flow ends at Step 409.

The other flow path from Step 404 moves to Step 410: whenever the cells of one selected 4 KB Odd page fail the B′-state Program-Verify step, the iterative Program-Verify on the remaining Odd pages has to be continued.

Step 410 and Step 411: These two steps write back and lock the updated Odd SLC or MSB page data into the two designated CACHEcel and CACHEint capacitors for the next iterative Program-Verify operation. Again, the reason for writing back to two CACHEs instead of one is that the verify step always needs one CACHEint to keep the last updated Odd page data, while CACHEcel uses the P/RB data as the new, next iterative Program and Program-Inhibit page pattern. The copy back to the two registers, CACHEcel and CACHEint, can be done simultaneously, as in previous Steps 353 and 354. The details are skipped here.

Step 412: This decision step checks whether the last 4 KB Odd page of data passes the Program-Verify. If No, the flow moves to Step 365, which continues the remaining Odd-page SLC and MSB program by setting up and latching the desired Vpgm, Vpass, and H1 on the selected sets of 64WLs+1SSL+1GSL. If Yes, the flow moves to Step 413, which checks whether all m Odd pages (or Nmax page) of data pass the Program-Verify step.

Step 413: Once any single Odd page finishes one iterative Program and Program-Verify step, the counter N is increased to N+1, and the new N value is compared against the predetermined maximum number of iterative steps, Nmax.

Step 414: If the number of iterative steps is still less than the predetermined Nmax, the next iterative Program and Program-Verify step continues by branching to Step 369.

Step 415: If the number of iterative steps reaches Nmax, the next iterative Program and Program-Verify step is stopped and the failed pages are reported as BAD pages.
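The iteration control of Steps 413 to 415 can be sketched in Python as a bounded program-and-verify loop. The FakePage class is a hypothetical stand-in for a page that verifies after a fixed number of pulses; it is not part of this disclosure and exists only so the loop can be exercised.

```python
class FakePage:
    """Hypothetical page model that passes verify after N pulses."""

    def __init__(self, pulses_needed):
        self.pulses_needed = pulses_needed
        self.pulses_seen = 0

    def pulse(self):
        """Apply one iterative Program pulse."""
        self.pulses_seen += 1

    def verify(self):
        """Return True once the page passes Program-Verify."""
        return self.pulses_seen >= self.pulses_needed


def program_with_limit(page, n_max):
    """Repeat Program and Program-Verify until pass or N reaches Nmax."""
    n = 0
    while n < n_max:        # Step 414: continue while N < Nmax
        page.pulse()
        n += 1              # Step 413: N is increased to N+1
        if page.verify():
            return "PASS", n
    return "BAD", n         # Step 415: report the page as BAD


print(program_with_limit(FakePage(3), 8))    # ('PASS', 3)
print(program_with_limit(FakePage(10), 8))   # ('BAD', 8)
```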

Note that one approach of the present invention is to program other erased pages whenever failed program pages are reported. In this manner, every time a new m-page SLC and MLC MSB Program is performed, it can always be completed on exactly m pages without missing any page, so the power and time consumed in reloading the page data of those failed pages can be avoided. The NAND chip performing the m-page Program and Program-Verify has to record the final physical good and bad WL addresses to allow the Flash Controller to update their status in its log file.

FIG. 7A is a diagram showing a circuit structure and method for performing MLC (LSB page) m-page data loading and B′-adjustment according to an embodiment of the present invention. As shown, this flow gives the preferred detailed steps of sequentially loading m pages of LSB page data from the 8 external I/Os and of adjusting the B′-state page data in accordance with the LSB bit data, the MSB bit data, and the preferred 4-Vt assignments shown in FIG. 4 for the m-page MLC LSB Program and Program-Inhibit operations of the present invention. The B′-adjustment differentiates B1′-state cells with Vtb1′<Vtbmin from B2′-state cells with Vtb2′≧Vtbmin. The B1′-state exists only in the MLC MSB bit, but is eventually programmed into the B2′-state (i.e., the B-state) when LSB=1 or into the C-state when LSB=0, with MSB=0.

In an embodiment, performing B′-state data bit adjustment is mainly fordoing the right MLC-LSB page Program in accordance with the preferred2-Vt assignments of E-state and B′-state for a MSB bit and the preferred4-Vt assignment of E, A, B, and C states for both MSB and LSB bits asset in FIG. 4. As such, the newly I/O loaded raw LSB page data cannot bedirectly used for MLC LSB page Program, thus each LSB page data needs tobe readjusted. On the contrary, each 8 KB externally I/O loaded rawMLC-MSB page data can be used directly for 8 KB MSB Program because MSBbit Program is always performed before each LSB bit Program on same MLCcell. In other words, MSB bit Program does not care about each LSB bitdata in each MLC 4-Vt assignment.

For example, each MSB's B′-state is split into B1′-state and B2′-state(or B-state) under respective verification conditions of Vtb′min andVtbmin. The value of temporary Vtb′min is preferably set to beVtb′min<Vtbmin for achieving tighter Vt distributions for 4 MLC statesof E, A, B, and C over the lengthy duration of 4-Vt MLC Program.

From one of preferred 4-Vt logic assignment, E=11, A=01, B=10 and C=00(The left bit is LSB bit but right bit is MSB bit). As such, If each rawLSB bit=1, it is meant to keep the Vts of E-cell or B-cell unchanged.Conversely, when each raw LSB bit=0, it is meant to program eitherE-cell to A-cell or B-cell to C-cell. But in MLC MSB bit Programoperation, an extra B1′ below B2′ or B-state but above E-state iscreated for a tighter 4-Vt MLC Program control over time. Under thisscenario, B1′-cell is not accounted as B-cell from each raw LSB bitperspective. For LSB=1, B-cell is not chosen for Program. By contrast,B′-cell has to be programmed even LSB=1. That explains why in thism-page MLC 4-Vt Program scheme, each raw LSB bit data has to bereadjusted to assign “0” to B1′-cell but “1” for B2′-cell or B-cell. Inorder to do so, each LSB value of “1” for B-cell has to be flipped to be“0” for B1′-cell, which is differentiated from B-cell. In addition,logically, each corresponding MSB bit data to each LSB bit is requiredto do the right bit flipping.
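The readjustment rule described above can be summarized in a short Python sketch. It is one reading of the text, not the circuit-level procedure: the function name `adjust_lsb` is hypothetical, the cell Vt is passed in as a number purely for illustration, and the returned bit follows the Program convention in which “0” selects a cell for Program and “1” inhibits it.

```python
def adjust_lsb(raw_lsb: int, msb: int, cell_vt: float, vtb_min: float) -> int:
    """Return the adjusted LSB Program bit: 0 = program the cell, 1 = inhibit.

    raw_lsb, msb : externally loaded LSB bit and previously programmed MSB bit
    cell_vt      : present threshold voltage of the cell (illustrative input)
    vtb_min      : Program-Verify level of the final B-state (Vtbmin)
    """
    if raw_lsb == 0:
        return 0        # E-cell -> A-cell or B-cell -> C-cell: always program
    if msb == 1:
        return 1        # E-cell with LSB=1 keeps its Vt: inhibit
    # raw_lsb == 1 and msb == 0: the target is the final B-state
    if cell_vt >= vtb_min:
        return 1        # B2'-cell (already a B-cell): inhibit
    return 0            # B1'-cell: the raw "1" is flipped to "0" and programmed
```

Only the last branch differs from using the raw LSB bit directly, which is the bit flip the paragraph refers to.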

In conclusion, because of the discrepancy between B1′ and B states, the newly I/O loaded LSB raw bit data cannot be used directly for LSB page Program of this MLC Program application. A solution is disclosed below in the form of a preferred MLC (LSB page) B′ loading and adjustment methodology comprising 9 consecutive sub-steps with 5 basic functions. It starts from the Even LSB page first, followed by the Odd LSB page, or vice versa.

Referring to FIG. 7A, the method includes precharge steps indicated by Arrow1 and Arrow8. All C_(LBL) capacitors in the selected CACHEs are initially precharged with Vinh voltage for the purpose of a super Program-Inhibit scheme during Program or an enhanced charge-sharing scheme during Read or Verify operations. The method further includes discharge steps indicated by Arrow2 and Arrow3. The Vinh voltages of C_(LBL) capacitors in the selected CACHEs are either discharged to Vss or retained in accordance with the loaded MSB and LSB page data during B′-adjustment and Program operation. Furthermore, the method includes load and latch (LD/LT) steps, indicated by Arrows 4, 6 and 9, in which page data loading and latching are performed and Vdd/Vss is converted to Vinh/Vss.

In a specific embodiment of the present invention, there are several major types of data to be temporarily latched in each corresponding 8 KB pseudo CACHE C_(LBL) capacitors, including 1) precharged page data, 2) externally loaded Program page data, 3) internal readout page data, 4) write back page data, and 5) duplicate page data. The precharged page data, represented by the Vinh precharged voltage, are locked in the selected 8 KB pseudo CACHEs. The externally loaded Program page data include MLC's MSB or LSB or SLC page data loaded from 8 I/Os. The internal readout page data include those read from E, A, B1′, B and C-state cells. The write back page data are written from SA or P/RB into a pseudo CACHE which is updated for bit-flipping.

Moreover, the method includes charge-sharing steps indicated by Arrow5 and Arrow7 of FIG. 7A. This is the readout process that transfers the sensed cell analog data to be amplified by the Multiplier, because the signal magnitude has been reduced by the dilution of the LBL and GBL charge-sharing scheme.
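The dilution mentioned above follows from ordinary charge conservation between two capacitors. The Python sketch below illustrates it under stated assumptions: ideal capacitors, no leakage, and a GBL that starts at a known voltage (0 V by default). The function name `charge_share` and all numeric values are illustrative and are not taken from the specification.

```python
def charge_share(c_lbl: float, v_lbl: float,
                 c_gbl: float, v_gbl: float = 0.0) -> float:
    """Voltage after an LBL capacitor is connected to a GBL capacitor.

    Total charge is conserved: (C_lbl*V_lbl + C_gbl*V_gbl) / (C_lbl + C_gbl).
    """
    return (c_lbl * v_lbl + c_gbl * v_gbl) / (c_lbl + c_gbl)


# Illustrative numbers only: an LBL at 8 V sharing with a GBL three times larger
diluted = charge_share(c_lbl=1.0, v_lbl=8.0, c_gbl=3.0)
```

With a GBL capacitance larger than the LBL capacitance, the shared voltage is a fraction of the stored Vinh level, which is why an amplifying stage is placed ahead of the SA.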

The whole proposed methodology involves some key control signals in each selected Block, Segment, Group, and Plane such as:

-   a) SEGo and SEGe: The paired Segment control signals that connect or disconnect one pair of 4 KB Odd and 4 KB Even metal1 LBLs to or from one shared set of 4 KB metal2 GBLs.
-   b) PREo and PREe: The paired Segment power control signals that selectively connect or disconnect one pair of 4 KB Odd and 4 KB Even metal1 LBLs to or from the one common Segment power supply line LBLps.
-   c) LBLps: This is a Segment Vinh power supply line shared by one pair of 8 KB CACHE registers located physically above and below the LBLps line in the cell array layout. If the top and bottom C_(LBL) capacitors are turned on simultaneously, one LBLps line can precharge or discharge them concurrently.
-   d) TIE: This is a special bridge NMOS transistor inserted between one pair of two physically adjacent Segment C_(LBL) capacitors or lines. It is advantageous to have this TIE transistor and control signal during each C_(LBL) precharge and each discharge through the selected NAND cell in CACHEcel. For example, each readout cell's analog data in CACHEcel can be easily shared with the paired adjacent C_(LBL) capacitor in CACHEint so that easier data manipulation can be executed in SLC and MLC operations.

Referring to FIG. 7A, the LSB methodology flow runs through steps 1 to 9 (including those for Even and Odd pages) to sequentially receive, load, and latch m pages of 8 KB data. Each 8 KB page is further divided into 4 KB Even page data and 4 KB Odd page data loaded from eight common I/Os in units of a byte for a typical parallel NAND flash or in units of 4 I/Os for an 8-pin SPI Serial NAND flash configuration.

In addition, m Addresses are loaded into the respectively designated Address, Row and Column Buffers to set the corresponding flags on m selected Even and Odd Segment-decoder's latches shown in FIG. 2B and m selected Block-decoder's latches shown in FIG. 2A of the preferred HiNAND2 array shown in FIG. 1A, while the Command is loaded into an on-chip Command Buffer (not shown) and m pages of external I/O data are loaded into the corresponding on-chip 4 KB real CACHE register. Internal readout page data, however, can be loaded directly from SA or P/RB into the on-chip 4 KB real CACHE register and on to the external 8 I/Os, depending upon which are available at the moment of operation.

Note, a total of four pseudo CACHEs, namely 8 KB CACHEcel, 8 KB CACHEint, 8 KB CACHE1sb, and 8 KB CACHEmsb, are involved for properly operating this MLC (LSB Page) m-page data loading and B′-state adjustment. In FIG. 7A, only one partial cell array of Group 1 near DB is shown, while the other J-1 Groups are not shown due to paper space limitation. As for the definitions of the four CACHEs, the CACHE that contains one selected physical WL (or page) comprising 8 KB NAND cells is termed CACHEcel. The other 3 CACHEs are named arbitrarily for ease of explanation only. For example, if a newly selected WL is shifted to CACHEint, then CACHEint is re-termed as CACHEcel and CACHEcel is re-termed as CACHEint instead. Each Group of HiNAND2 array has L Segments, where L≧4. Moreover, each CACHE register is comprised of a plurality of NAND Blocks that share one local LBL or C_(LBL) capacitor per column.

There are four symbols representing different desired bias voltages: “0”=Vss, V1=Vdd, H1≧Vinh+Vt, and V2=Vinh, where 5V≦Vinh≦10V. The detailed operations are disclosed below.
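As a side note on the condition H1≧Vinh+Vt, the Python sketch below shows why a gate level of at least Vinh+Vt is needed for an NMOS pass transistor to hand the full Vinh level to a C_(LBL) capacitor. It uses the textbook first-order model of an NMOS pass gate (body effect ignored); the function name `nmos_pass_level` and the numeric values are illustrative assumptions, not figures from the specification.

```python
def nmos_pass_level(v_gate: float, v_in: float, vt: float) -> float:
    """Highest level an ideal NMOS pass gate can deliver to its output.

    The output follows the input until it reaches v_gate - vt, where the
    transistor cuts off, so the delivered level is min(v_in, v_gate - vt).
    """
    return min(v_in, v_gate - vt)


VINH = 8.0   # illustrative value inside the stated 5 V to 10 V range
VT = 1.0     # illustrative NMOS threshold voltage

full_level = nmos_pass_level(v_gate=VINH + VT, v_in=VINH, vt=VT)  # gate = H1
degraded_level = nmos_pass_level(v_gate=VINH, v_in=VINH, vt=VT)   # gate = Vinh only
```

With the gate at H1 the capacitor receives the whole Vinh; with the gate only at Vinh it would stop one Vt short.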

Arrow 1 or #1 is a first step of this methodology. As #1 indicates, both 4 KB even CLBL1e and 4 KB odd CLBL1o capacitors of CACHEcel, CACHEint, CACHE1sb, and CACHEmsb are selected for Vinh precharging initially. In order to do so, the following bias conditions are set up in accordance with the circuits of HiNAND2 array and its associated Block-decoder and Segment-decoder:

-   a) LBLps1=V2=Vinh and PREo1=PREe1=H1 are to turn on MLBLso and MLBLse concurrently to precharge both 4 KB even/odd CACHEcel C_(LBL) capacitors.
-   b) LBLps2=V2=Vinh and PREo1=PREe1=H1 are to turn on MLBLso and MLBLse concurrently to precharge both 4 KB even/odd CACHEint C_(LBL) capacitors.
-   c) LBLps3=V2=Vinh and PREo1=PREe1=H1 are to turn on MLBLso and MLBLse concurrently to precharge both 4 KB even/odd CACHE1sb C_(LBL) capacitors.
-   d) LBLps4=V2=Vinh and PREo1=PREe1=H1 are to turn on MLBLso and MLBLse concurrently to precharge both 4 KB even/odd CACHEmsb C_(LBL) capacitors.
-   e) TIE1=0V is to shut off MLBLp to disconnect CACHEcel from CACHEint so that CACHEcel and CACHEint can perform independent Vinh precharge. In fact, TIE1 can be any voltage because CACHEcel=CACHEint=Vinh.
-   f) TIE2=0V is to shut off MLBLp to disconnect CACHE1sb from CACHEmsb so that CACHE1sb and CACHEmsb can perform independent Vinh precharge. In fact, TIE2 can be any voltage because CACHE1sb=CACHEmsb=Vinh.
-   g) SEGo1=SEGe1=0V are to shut off MLBLpo and MLBLpe to prevent leakage of CACHEcel to 4 KB shared GBLs.
-   h) SEGo2=SEGe2=0V are to shut off MLBLpo and MLBLpe to prevent leakage of CACHEint to 4 KB shared GBLs.
-   i) SEGo3=SEGe3=0V are to shut off MLBLpo and MLBLpe to prevent leakage of CACHE1sb to 4 KB shared GBLs.
-   j) SEGo4=SEGe4=0V are to shut off MLBLpo and MLBLpe to prevent leakage of CACHEmsb to 4 KB shared GBLs.

Four Vinh current flows are supplied by four corresponding LBLps lines (with Drivers not shown). The Vinh precharge can be done in 1 cycle on all 4 Even/Odd CACHEs. The Vinh value can be flexibly adjusted down to Vdd if the CACHE is located near Data Register 700, where the impact of charge-sharing is smaller.

Arrow 2 or #2 is a second step of this methodology and is performed during the Program-Verify that will result in either discharging or retaining C_(LBL)'s Vinh, depending on the cells' states of E or B′ (MSB bit program state), because Vtamin is applied to m selected WLs. This is the same as reading the MSB programmed cell as done in the MSB bit Program flow. Note, since Vtamin is applied to m selected WLs, the discharge of C_(LBL)'s Vinh will happen on m 8 KB cells in m full WLs that include all 4 KB Even and 4 KB Odd cells.

Since each selected WL=Vtamin and is located in each CACHEcel in m or m/4 sets only, as explained above, Vinh discharging can happen only on the CACHEcel C_(LBL) capacitors of those E-cells, because Vte<Vtamin on the selected WL. Those programmed B′-cells or MSB cells will retain Vinh accordingly because Vtb′min>Vtamin, where Vtb′min is the Program-Verify voltage for both SLC and MSB bits.

The first #2 arrow shows that the discharge path is from both 4 KB CLBL1e and 4 KB CLBL1o capacitors into the selected Block 1 cells within CACHEcel. In fact, the discharge path is through the 64-WL NAND cell string, 1 MS and 1 MG String-select transistors. The high string series resistance is the same for both the present invention and prior art. But the capacitance of CLBL1e and CLBL1o in the present invention is only 1/(L×J) of the prior art NAND's BL. As a result, the Vinh LBL discharge of the present invention is much faster than the 1V BL discharge of prior art NAND.
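The effect of the 1/(L×J) capacitance ratio can be illustrated with a first-order RC estimate in Python. This is only a sketch under simplifying assumptions: the string is treated as a fixed resistance, the line as a lumped capacitor, and both lines are compared over the same relative voltage swing, whereas the text compares a Vinh swing against a 1V swing. The function names and all numeric values are hypothetical.

```python
import math


def lbl_capacitance(c_bl: float, segments_l: int, groups_j: int) -> float:
    """Capacitance of one local bit line as 1/(L*J) of a full-length BL."""
    return c_bl / (segments_l * groups_j)


def rc_discharge_time(r_string: float, c_line: float,
                      v_start: float, v_end: float) -> float:
    """Time for an RC node to decay from v_start to v_end (v_end > 0)."""
    return r_string * c_line * math.log(v_start / v_end)


# Illustrative values: 1 Mohm string, 1 pF full BL, L = 4 Segments, J = 8 Groups
t_bl = rc_discharge_time(1e6, 1e-12, 1.0, 0.5)
t_lbl = rc_discharge_time(1e6, lbl_capacitance(1e-12, 4, 8), 1.0, 0.5)
speedup = t_bl / t_lbl   # equals L*J under these assumptions
```

Because the series resistance is unchanged, the discharge time in this model scales directly with the line capacitance.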

The second #2 arrow shows the discharge path from each bit of CACHEint to each corresponding bit of CACHEcel via each MLBLb transistor with its gate tied to TIE1. The bias conditions of #2 are explained below:

-   a) TIE1=H1 (or Vdd)
    This is to turn on MLBLb to connect a path between CACHEcel and CACHEint only, so that current flow, if any, can happen on these two CACHEs only. Since the selected WL's cells are in CACHEcel, those E-cells in CACHEcel discharge their corresponding C_(LBL) from Vinh to Vss. But due to the connection between each CACHEcel and each CACHEint C_(LBL), the discharge of CACHEcel also discharges the CACHEint C_(LBL), taking about twice as long. Thus, finally, CACHEcel=CACHEint=Vinh for B′ cells in CACHEcel. Conversely, the C_(LBL)s of CACHEcel=CACHEint=Vss for E cells in CACHEcel. This self-timed step is done under selected WL=Vtamin by a V_(LBLps1) Detector.
-   b) TIEL/2=0V (L=4)
    This is because no Vinh discharge is desired in either CACHE1sb or CACHEmsb.
-   c) Others=0V are to shut off any leakage paths to ensure that the Vinh discharge, which happens only between CACHEcel and CACHEint, is not affected.

In conclusion, after this step, both CACHEcel=CACHEint=MSB=Vinh/Vss bit data, but the data are not yet saved.
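The outcome of the #2 step can be modeled per column with a small Python sketch. It is a behavioral illustration only: cell states are given as the strings "E" and "B'", the Vinh value is an arbitrary number in the stated range, and the function name `step2_discharge` is hypothetical.

```python
from typing import List, Tuple

VINH = 8.0   # illustrative Vinh level (the text allows 5 V to 10 V)
VSS = 0.0


def step2_discharge(cell_states: List[str]) -> Tuple[List[float], List[float]]:
    """Model the #2 step with TIE1 on: E-cells conduct, B'-cells do not.

    Returns the per-column C_LBL voltages of (CACHEcel, CACHEint). Because
    TIE1 bridges the two CACHEs, CACHEint ends up with the same pattern.
    """
    cachecel = [VSS if state == "E" else VINH for state in cell_states]
    cacheint = list(cachecel)   # bridged columns settle to the same level
    return cachecel, cacheint


cel, cint = step2_discharge(["E", "B'", "B'", "E"])
```

Turning TIE1 off afterwards, as in the following #3 step, would leave the two identical patterns isolated from each other.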

Arrow 3 (#3): This is a third step to perform Vinh/Vss latching in accordance with 8 KB B′/E (MSB) state cells in one selected physical WL in CACHEcel Segment. Once Vinh discharging is completed for those E-cells, the TIE signal switches from H1 or Vdd to Vss to disconnect CACHEcel from CACHEint so that the final identical Vinh/Vss analog patterns according to the B′/E-states of the 8 KB cells can be locked in CACHEcel and CACHEint respectively and safely for the subsequent bit-flipping Program-Verify operation. In conclusion, after this step, the analog Vinh/Vss voltage patterns of the 8 KB C_(LBL) of selected CACHEcel=CACHEint=MSB=Vinh or Vss are latched temporarily. The preferred bias conditions include setting all signals to 0V, including all TIEL/2=0V (e.g., L=4 as seen in FIG. 7A).

Arrow4e or #4e is a first part of a fourth step of this methodology. The #4e step is to apply V1=Vdd to the SEGe3 signal to connect each of the 4 KB GBLs to the corresponding 4 KB even C_(LBL) capacitors in ½ of the 8 KB CACHE1sb register only, via the corresponding 4 KB Even MLBLpe transistors in conduction state. The remaining ½, the 4 KB Odd CACHE1sb, is isolated from the 4 KB GBLs at this step to prevent GBL loading into non-selected pseudo CACHEs.

The #4e step is to perform 4 KB Even page data loading, latching, and Vdd/Vss to Vinh/Vss conversion in LBL lines within CACHE1sb when the 4 KB GBLs are loaded with the external 4 KB Even LSB raw digital page data. If the LSB bit pattern=1=Vdd, which makes the drain voltage of MLBLpe=Vdd, then when the gate of MLBLpe=SEGe3=Vdd, the source voltage of MLBLpe=Vinh is retained. Conversely, if the LSB bit pattern=0, which makes the drain voltage of MLBLpe=Vss, then when the gate of MLBLpe=SEGe3=Vdd, the source voltage of MLBLpe=Vinh is discharged to Vss as well. In an embodiment, Vdd/Vss to Vinh/Vss conversion in LBL lines is executed for any data loaded into a CACHE register for the super Program-Inhibit scheme. The preferred bias conditions include:

-   a) SEGe3=V1, SEGo3=Vss,
-   b) All other signals=0V to allow the 4 KB digital page data in 4 KB GBL bus lines to be fully loaded into the selected 4 KB Even CACHE1sb C_(LBL) capacitors, not other CACHEs.
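The Vdd/Vss to Vinh/Vss conversion described for the #4e step can be sketched per column in Python. The model is idealized: the pass transistor is treated as a switch that conducts only when its gate exceeds the lower of its two terminals by more than Vt, and a conducting transistor is assumed to pull the LBL to the driven GBL level. The function name `lbl_after_load` and the values chosen for Vdd, Vinh and Vt are illustrative assumptions.

```python
VDD, VSS, VINH, VT = 1.8, 0.0, 8.0, 0.7   # illustrative levels only


def lbl_after_load(v_gbl: float, v_lbl: float,
                   v_gate: float = VDD, vt: float = VT) -> float:
    """LBL voltage after the SEG pass gate is pulsed with v_gate.

    v_gbl : GBL level carrying the digital bit (Vdd for "1", Vss for "0")
    v_lbl : LBL level before loading (Vinh after precharge)
    """
    lower_terminal = min(v_gbl, v_lbl)
    if v_gate - lower_terminal > vt:
        return v_gbl          # transistor conducts: LBL follows the GBL level
    return v_lbl              # transistor stays off: LBL keeps its charge


bit1 = lbl_after_load(v_gbl=VDD, v_lbl=VINH)   # "1": gate-to-GBL is 0 V, Vinh retained
bit0 = lbl_after_load(v_gbl=VSS, v_lbl=VINH)   # "0": transistor on, LBL discharged
```

A digital “1” therefore appears on the LBL as Vinh rather than Vdd, while a digital “0” appears as Vss.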

Arrow 4o (#4o) is a second part of the fourth step of this methodology. Similarly, the #4o step is to apply V1=Vdd to the SEGo3 signal to connect each 4 KB GBL to the corresponding 4 KB Odd C_(LBL) capacitors in ½ of the 8 KB CACHE1sb register only, via the corresponding 4 KB Odd MLBLpo transistors. The remaining ½, the 4 KB Even CACHE1sb, is isolated from the 4 KB GBLs at this step. The preferred bias conditions are listed below:

-   a) SEGo3=V1=Vdd, SEGe3=Vss,
-   b) All other signals=0V to allow the 4 KB digital page data in 4 KB GBL bus lines to be fully loaded and latched into the selected 4 KB Odd CACHE1sb C_(LBL) capacitors, not other Odd CACHEs.

In conclusion, after the #4e and #4o steps, the whole 8 KB LSB data are locked (saved) in the 8 KB CACHE1sb C_(LBL) capacitors for the subsequent Program and Program-Verify operations.
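The two-cycle Even/Odd loading through a half-width GBL bus can be pictured with the Python sketch below. It assumes, for illustration only, that Even and Odd columns are interleaved along the page, and it uses a list of bits in place of the 8 KB page; the function name `load_page_in_two_cycles` is hypothetical.

```python
from typing import List, Tuple


def load_page_in_two_cycles(page: List[int]) -> Tuple[List[int], int]:
    """Load a full page into a pseudo CACHE through a half-width shared bus.

    Cycle 1 moves the Even columns, cycle 2 the Odd columns, so only half
    the page is on the bus at any time. Returns (cache contents, cycles).
    """
    cache = [-1] * len(page)        # -1 marks a column not yet loaded
    cycles = 0
    for parity in (0, 1):           # 0 = Even half (#4e), 1 = Odd half (#4o)
        bus = page[parity::2]       # half-page presented on the shared GBLs
        cache[parity::2] = bus      # only the selected half is connected
        cycles += 1
    return cache, cycles


loaded, used_cycles = load_page_in_two_cycles([1, 0, 0, 1, 1, 1])
```

The same pattern applies to the later Even/Odd step pairs, which differ only in the Segment being addressed and the direction of the transfer.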

Arrow5e or #5e is a first part of a fifth step of this methodology, performing the analog sensing and latching by a 4 KB Multiplier and 4 KB SA from the selected 4 KB Even cells within 4 KB Even CACHEint C_(LBL) capacitors. Therefore, H1 voltage is applied to the SEGe2 signal to connect each 4 KB GBL to each corresponding 4 KB Even C_(LBL) capacitor in ½ of the 8 KB CACHEint register only, via the corresponding 4 KB Even MLBLpe transistors. The remaining ½, the 4 KB Odd CACHEint, is isolated from the 4 KB GBLs at this step. The preferred bias conditions are listed below:

-   a) SEGe2=H1, SEGo2=Vss,
-   b) All other signals=0V to allow the 4 KB GBL bus lines to be dedicated to 4 KB Even CACHEint sensing and amplification, not affected by other CACHEs.

Arrow5o or #5o is a second part of the fifth step of this methodology. Similarly, the #5o step is to perform the analog sensing and latching by a 4 KB Multiplier and 4 KB SA from the selected 4 KB Odd cells within 4 KB Odd CACHEint C_(LBL) capacitors. Therefore, H1 voltage is applied to the SEGo2 signal to connect each 4 KB GBL to each corresponding 4 KB Odd C_(LBL) capacitor in ½ of the 8 KB CACHEint register only, via the corresponding 4 KB Odd MLBLpo transistors. The remaining ½, the 4 KB Even CACHEint, is isolated from the 4 KB GBLs at this step. The preferred bias conditions are:

-   a) SEGo2=H1, SEGe2=Vss,
-   b) All other signals=0V to allow the 4 KB GBL bus lines to be dedicated to 4 KB Odd CACHEint sensing and amplification, not affected by other CACHEs.

Arrow6e or #6e is a first part of a sixth step of this methodology. The #6e step, like the #4e step, is to apply V1=Vdd to the SEGe4 signal to connect each of the 4 KB GBLs to the corresponding 4 KB even C_(LBL) capacitors in ½ of the 8 KB CACHEmsb register only, via the corresponding 4 KB Even MLBLpe transistors. The remaining ½, the 4 KB Odd CACHEmsb, is isolated from the 4 KB GBLs at this step.

The #6e step is to perform 4 KB Even data loading, latching and Vdd/Vss to Vinh/Vss conversion in LBLs when the 4 KB GBLs are loaded with the external 4 KB Even MSB page data. If the MSB bit pattern=1, which makes the drain voltage of MLBLpe=Vdd, then when the gate of MLBLpe=SEGe4=Vdd, the source voltage of MLBLpe=Vinh is retained.

Conversely, if the MSB bit pattern=0, which makes the drain voltage of MLBLpe=Vss, then when the gate of MLBLpe=SEGe4=Vdd, the source voltage of MLBLpe=Vinh is discharged to Vss as well. This is a novel step of Vdd/Vss to Vinh/Vss conversion in LBLs for any data loaded into a CACHE for the super Program-Inhibit scheme. The preferred bias conditions are:

-   a) SEGe4=V1, SEGo4=Vss,
-   b) All other signals=0V to allow the 4 KB digital page data in 4 KB GBL bus lines to be fully loaded into the selected 4 KB Even CACHEmsb C_(LBL) capacitors, not other CACHEs.

Arrow6o or #6o is a second part of the sixth step of this methodology. Similarly, the #6o step is to apply V1=Vdd to the SEGo4 signal to connect each 4 KB GBL to the corresponding 4 KB Odd C_(LBL) capacitors in ½ of the 8 KB CACHEmsb register only, via the corresponding 4 KB Odd MLBLpo transistors. The remaining ½, the 4 KB Even CACHEmsb, is isolated from the 4 KB GBLs at this step. The preferred bias conditions are:

-   a) SEGo4=V1, SEGe4=Vss,
-   b) All other signals=0V to allow the 4 KB digital page data in 4 KB GBL bus lines to be fully loaded into the selected 4 KB Odd CACHEmsb C_(LBL) capacitors, not other CACHEs.

After the #6e and #6o steps, the whole 8 KB MSB data are locked (saved) in the 8 KB CACHEmsb C_(LBL) capacitors for the subsequent Program and Program-Verify operations.

Arrow7e or #7e is a first part of a seventh step of this methodology. The #7e step is to perform the analog sensing and latching by 4 KB Multiplier and 4 KB SA from the selected 4 KB Even cells within 4 KB Even CACHEcel C_(LBL) capacitors. Therefore, H1 is applied to the SEGe1 signal to connect each 4 KB GBL to each corresponding 4 KB Even C_(LBL) capacitor in ½ of the 8 KB CACHEcel register only, via the corresponding 4 KB Even MLBLpe transistors. The remaining ½, the 4 KB Odd CACHEcel, is isolated from the 4 KB GBLs at this step. The preferred bias conditions include:

-   a) SEGe1=H1, SEGo1=Vss,
-   b) All other signals=0V to allow the 4 KB Even digital page data in 4 KB GBL bus lines to be dedicated to 4 KB Even CACHEcel sensing and amplification, not affected by other CACHEs.

Arrow7o or #7o is a second part of the seventh step of this methodology. Similarly, the #7o step is to perform the analog sensing and latching by 4 KB Multiplier and 4 KB SA from the selected 4 KB Odd cells within 4 KB Odd CACHEcel C_(LBL) capacitors. Therefore, H1 is applied to the SEGo1 signal to connect each 4 KB GBL to each corresponding 4 KB Odd C_(LBL) capacitor in ½ of the 8 KB CACHEcel register only, via the corresponding 4 KB Odd MLBLpo transistors. The remaining ½, the 4 KB Even CACHEcel, is isolated from the 4 KB GBLs at this step. The preferred bias conditions are:

-   a) SEGo1=H1, SEGe1=Vss,
-   b) All other signals=0V to allow only 4 KB Odd CACHEcel to be sensed and amplified by 4 KB Multipliers and 4 KB SAs.

Arrow8e or #8e and Arrow8o or #8o are respectively a first and a second part of an eighth step of this methodology, indicating that both 4 KB even CLBL1e and 4 KB odd CLBL1o capacitors of CACHEcel are selected for Vinh precharging with the preferred bias conditions below.

-   a) LBLps1=V2=Vinh and PREo1=PREe1=H1 are to turn on MLBLso and MLBLse concurrently to precharge both 4 KB even/odd CACHEcel LBL capacitors.
-   b) TIE1=0V is to shut off MLBLp to disconnect CACHEcel from CACHEint so that CACHEcel and CACHEint can perform independent Vinh precharge. In fact, TIE can be any voltage because CACHEcel=CACHEint=Vinh. TIE1˜TIEL/2=0V.
-   c) SEGo1=SEGe1=0V are to shut off MLBLpo and MLBLpe to prevent leakage of CACHEcel to 4 KB shared GBLs.

Arrow9e or #9e is a first part of a ninth step of this methodology. The #9e step is to apply V1 to the SEGe1 signal to connect each of the 4 KB GBLs to the corresponding 4 KB even C_(LBL) capacitors in ½ of the 8 KB CACHEcel register only, via the corresponding 4 KB Even MLBLpe transistors. The remaining ½, the 4 KB Odd CACHEcel, is isolated from the 4 KB GBLs at this step. The #9e step is further to perform 4 KB Even data latching and Vdd/Vss to Vinh/Vss conversion in LBLs in accordance with the 4 KB Even page data in CACHEcel. The preferred bias conditions are:

-   a) SEGe1=V1, SEGo1=Vss,
-   b) All other signals=0V to allow the 4 KB digital page data in 4 KB GBL bus lines to be fully loaded into the selected 4 KB Even CACHEcel C_(LBL) capacitors, not other CACHEs.

Arrow9o or #9o is a second part of the ninth step of this methodology. Similarly, the #9o step is to apply V1 to the SEGo1 signal to connect each 4 KB GBL to the corresponding 4 KB Odd C_(LBL) capacitors in ½ of the 8 KB CACHEcel register only, via the corresponding 4 KB Odd MLBLpo transistors. The remaining ½, the 4 KB Even CACHEcel, is isolated from the 4 KB GBLs at this step. The preferred bias conditions are:

-   a) SEGo1=V1, SEGe1=Vss,
-   b) All other signals=0V to allow the 4 KB digital page data in 4 KB GBL bus lines to be fully loaded into the selected 4 KB Odd CACHEcel C_(LBL) capacitors, not other CACHEs.

After the #9e and #9o steps, the whole 8 KB LSB data is latched (saved) in the 8 KB CACHEcel C_(LBL) capacitors for the subsequent Bit-flipping of Program and Program-Verify operations.

FIG. 7B is a diagram showing circuit structure and method for performing MLC (LSB page) multi-page A-state Program-Verify operation according to an embodiment of the present invention. As shown, this is a first flow of a preferred m-page LSB Even A-cell Program-Verify operation to program the selected E-state cells to A-state cells by using Vtamin as a Program-Verify WL voltage in all selected m pages when the externally inputted bit data are LSB=0 and MSB=1. This methodology is used further along with the flow charts shown in FIG. 7H and FIG. 7I to explain the preferred m-page MLC (LSB page) A-state Program-Verify operation. As shown, the A-state Program-Verify methodology is based on a scenario in which m pages of MLC's MSB bits have been programmed successfully in all selected MLC cells in m selected MLC-WLs with a condition of Flag cell bit=0 in each selected MLC-WL. In order to provide a more reliable and faster flow with less power consumption for m-page MLC 4-Vt Program-Verify operation, this preferred methodology is designed to check 4 Vts from the lowest program Vt of A-state by Vtamin, to the next higher program Vt of B-state by Vtbmin, and last to the highest program Vt of C-state by Vtcmin. Note, a 4-Vt MLC cell contains one final erased E-state (11) but 3 final program states of A (10), B (01) and C (00).

In an embodiment, this methodology is designed based on a fixed rule that each MSB bit Program has to be performed before each LSB bit Program. But the Verify operation is preferably not limited to the sequence of starting from A-state, next B-state, and finally C-state. When m random pages are selected for an m-page LSB Program-Verify operation, it is preferably performed on a 4 KB-by-4 KB basis due to the limit of 4 KB GBL bus lines and 4 KB Multiplier, SA, and P/RB, which are sized for area saving.

Furthermore, for m random-page Program-Verify, each random page is preferably initiated and completed by self-timed charging and discharging control of each WL by a corresponding Vread Detector and of the 8 KB C_(LBL) capacitors by each corresponding VLBL Detector. Conversely, for m non-random-page Program-Verify, the m non-random pages can preferably be initiated and completed on the same timelines by using only one self-timed Vread Detector and one VLBL Detector.

In an embodiment, it is desired to have m-page Program, Program-Verify, and Read operations for m mixed random/non-random pages in one single HiNAND2 chip design. This preferred m-page MLC (LSB page) A-state Program-Verify operation, as shown in FIG. 7B, comprises 9 steps indicated by arrows named 1, 2, 3e, 3o, 4e, 4o, 5e, 6e, 6o, 7e, 7o, 8e, 8o, 9e and 9o, with several basic operations similar to those explained before for FIG. 7A. Below, the term PGM-VFY represents Program-Verify and CS represents charge-sharing.

In a specific embodiment, the methodology for performing m-page MLC (LSB page) A-state Program-Verify operation includes precharging one or more pseudo CACHE registers, discharging selective C_(LBL) capacitors in accordance with the new readout MSB page data after A-state PGM-VFY operation, loading and latching 8 KB A-state page data, and charge-sharing among the selective C_(LBL) capacitors with corresponding GBLs. For the precharging step, indicated by Arrows 1, 5 and 8, at most two registers, CACHEcel and CACHEint, need to be selected for precharging with Vinh voltage for A-state PGM-VFY. In the discharging step, indicated by Arrow 2, the selective C_(LBL) capacitors in the selected CACHEcel and CACHEint are either discharged to Vss or retained at the Vinh voltage in accordance with the new readout MSB page data after A-state Program-Verify operation. Additionally, in the loading and latching (LD/LT) step, indicated by Arrows 6 and 9, there are 8 KB MSB page data and 8 KB A-state data to be loaded from the 4 KB P/RB or 4 KB SA in 2 cycles and temporarily latched in each corresponding 8 KB CACHEcel and 8 KB CACHEmsb C_(LBL) capacitors. Furthermore, the charge-sharing step, indicated by Arrows 3, 4 and 7, is a readout process that transfers each sensed cell analog data to be amplified by each Multiplier, because the signal magnitude has been reduced by the dilution of each LBL and each GBL CS scheme.

The whole proposed methodology involves some key control signals in each selected Block, Segment, Group, and Plane such as:

-   a) SEGo and SEGe: The paired Segment control signals that connect or disconnect one pair of 4 KB Odd and 4 KB Even metal1 LBLs to or from one shared set of 4 KB metal2 GBLs.
-   b) PREo and PREe: The paired Segment power control signals that selectively connect or disconnect one pair of 4 KB Odd and 4 KB Even metal1 LBLs to or from the one common Segment power supply line LBLps.
-   c) LBLps: This is a Segment Vinh power supply line shared by one pair of 8 KB CACHE registers located physically above and below the LBLps line in the cell array layout. If the top and bottom C_(LBL) capacitors are turned on simultaneously, one LBLps line can precharge or discharge them concurrently.
-   d) TIE: This is a special bridge NMOS transistor inserted between one pair of two physically adjacent Segment C_(LBL) capacitors or lines. It is advantageous to have this TIE transistor and control signal during each C_(LBL) precharge and each discharge through the selected NAND cell in CACHEcel. For example, each readout cell's analog data in CACHEcel can be easily shared with the paired adjacent C_(LBL) capacitor in CACHEint so that easier data manipulation can be executed in SLC and MLC operations.

Referring to FIG. 7B, the LSB methodology flow runs from step 1 to 9 to sequentially receive, load, and latch m pages of 8 KB LSB data. Each 8 KB LSB page is further divided into 4 KB Even LSB page data and 4 KB Odd LSB page data loaded from eight common I/Os in units of a byte for a typical Parallel NAND flash or in units of 4 I/Os for an 8-pin SPI Serial NAND flash configuration.

Arrow1 or #1 is a first step of this methodology, indicating that only the 4 KB even LBL1e and 4 KB odd LBL1o capacitors of CACHEcel are selected for Vinh precharging initially. In order to do so, the following bias conditions are set up in accordance with the circuits of HiNAND2 array and its associated Block and Segment decoders.

-   a) LBLps1=V2=Vinh and PREo1=PREe1=H1 are to turn on MLBLso and MLBLse concurrently to precharge both 4 KB Even/Odd CACHEcel C_(LBL) capacitors.
-   b) LBLps2=Vss,
-   c) TIE1=0V is to shut off MLBLp to disconnect CACHEcel from CACHEint so that CACHEcel Vinh would not leak to CACHEint.
-   d) TIEL/2=0V (L=4 in this example) is to shut off another MLBLp because no precharge will happen to CACHE1sb and CACHEmsb in this step.
-   e) SEGo1=SEGe1=0V are to shut off 4 KB MLBLpo and 4 KB MLBLpe to prevent leakage of 4 KB CACHEcel to 4 KB shared GBL lines.
-   f) All other SEGo2/3/4=SEGe2/3/4=0V are to shut off MLBLpo and MLBLpe to prevent leakage of the other, non-CACHEcel registers.

The Vinh current flow is supplied by one corresponding LBLps1 line with a Vinh Driver (not shown). The Vinh precharging step can be done in 1 cycle on all 4 Even/Odd CACHEs. The Vinh value can be flexibly adjusted down to Vdd if the CACHE is located near DR 700, where the impact of charge-sharing is smaller.

Arrow 2 (#2) indicates a second step to perform 8 KB A-state PGM-VFY, which results in either the C_(LBL) voltage going to Vss due to discharging or C_(LBL) being retained at Vinh due to no discharging, depending on the cells' states in CACHEcel. For example, E-cells=Vss because Vtemax<Vtamin, but A-cells=B-cells=C-cells=Vinh because there is no cell current conduction due to Vtamin<Vtbmin<Vtcmin. In conclusion, after this step, the 8 KB MLC analog cell voltages, with E-cells=Vss differentiated from A-cells=B-cells=C-cells=Vinh, are stored in the m 8 KB CACHEcels.
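The outcome of the A-state PGM-VFY step can be expressed per column as a simple threshold comparison. The Python sketch below is a behavioral illustration only: a cell is assumed to conduct, and so discharge its C_(LBL), exactly when its Vt lies below the WL voltage Vtamin. The function name `a_state_verify` and every numeric Vt value are hypothetical.

```python
from typing import List


def a_state_verify(cell_vts: List[float], vta_min: float,
                   vinh: float, vss: float = 0.0) -> List[float]:
    """C_LBL voltages in CACHEcel after applying Vtamin to the selected WL.

    Cells with Vt below vta_min (E-cells) conduct and discharge to Vss;
    cells at or above vta_min (A, B and C cells) retain Vinh.
    """
    return [vss if vt < vta_min else vinh for vt in cell_vts]


# Illustrative Vt values for one E, A, B and C cell, in that order
pattern = a_state_verify([-2.0, 0.6, 1.8, 3.0], vta_min=0.5, vinh=8.0)
```

Only the E-cell column ends at Vss, matching the E-cells=Vss and A-cells=B-cells=C-cells=Vinh result stated above.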

Arrow 3e (#3e): The #3e step is to directly load the last stored 4 KB Even LSB digital Program data in the 4 KB P/RBs to the 4 KB pseudo Even CACHEint, where it is latched in 1 cycle.

As explained before, this step is substantially a digital (Vdd/Vss) to analog (Vinh/Vss) conversion. A Vinh precharge has already been performed on the targeted 4 KB Even CACHEint before this LD/LT step, as indicated in the FIG. 7A methodology. Thus the Vinh precharge is skipped here and the LD/LT operation is directly performed in this #3e step with the preferred bias conditions below:

-   a) SEGe2=H1, SEGo2=Vss,
-   b) All other signals=0V to ensure that only one set of 4 KB Even CACHEe LBLs is exclusively connected to 4 KB shared GBLs for the LD/LT step.
-   c) 4 KB GBL voltages=Vdd/Vss of 4 KB P/RB digital patterns.
    After this #3e step, the flow moves to the #3o step below. Once it moves to the #3o step, the above 4 KB P/RB data are latched in the 4 KB Even CACHEint in the form of Vinh/Vss.

Arrow 3o (#3o): The #3o step is to directly load the lastly stored 4 KB Odd LSB digital Program data in the 4 KB P/RBs into the 4 KB pseudo Odd CACHEint, where the data are latched in 1-cycle. After the #3o step, the whole 8 KB CACHEint is filled in 2 cycles with the 8 KB LSB page data from the on-chip 4 KB real CACHE, which in turn are loaded via 8 I/Os in 8K sequential cycles. The preferred bias conditions are opposite to the #3e step:

-   -   a) SEGo2=H1, SEGe2=Vss,
    -   b) All other signals=0V to ensure that only one set of 4 KB Odd CACHEo LBLs is exclusively connected to 4 KB shared GBL lines for LD/LT step.
    -   c) 4 KB GBL voltages=Vdd/Vss of 4 KB P/RB digital patterns.
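The digital (Vdd/Vss) to analog (Vinh/Vss) conversion of the #3e and #3o LD/LT steps can be sketched as below. This is a simplified behavioral model under stated assumptions: each C_(LBL) capacitor is taken to be precharged to Vinh beforehand, a GBL at Vss is assumed to pull the connected capacitor down to Vss, and a GBL at Vdd is assumed to leave the precharged Vinh in place. The function name and the 7 V value of Vinh are hypothetical.

```python
def load_and_latch(prb_bits, vinh=7.0, vss=0.0):
    """Convert digital P/RB bits (1=Vdd, 0=Vss) into C_LBL voltages.

    Assumes every C_LBL was precharged to Vinh before the LD/LT step.
    """
    clbl = [vinh] * len(prb_bits)  # state left by the earlier Vinh precharge
    for i, bit in enumerate(prb_bits):
        if bit == 0:
            clbl[i] = vss  # GBL at Vss discharges the capacitor
    return clbl


even_cacheint = load_and_latch([1, 0, 1, 1])  # #3e: Even half
odd_cacheint = load_and_latch([0, 0, 1, 0])   # #3o: Odd half
print(even_cacheint, odd_cacheint)
```

The two calls mirror the 2-cycle filling of CACHEint, one cycle per Even or Odd half over the shared GBLs.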

Arrow 4e (#4e): The #4e step is to perform Vinh/Vss analog sensing and amplification of the lastly stored 4 KB MSB Even page data stored in the 4 KB Even CACHEmsb by 4 KB Multipliers first and then 4 KB SAs. The preferred bias conditions are summarized below:

-   -   a) SEGe4=H1, SEGo4=Vss,
    -   b) All other signals=0V to ensure that only the 4 KB Even CACHEmsb is selected for performing charge-sharing.

Arrow 4o (#4o): The #4o step is to perform the similar Vinh/Vss analog sensing and amplification of the lastly stored 4 KB MSB Odd page data stored in the 4 KB Odd CACHEmsb by 4 KB Multipliers first and then 4 KB SAs. The preferred bias conditions are opposite to the #4e step as summarized below:

-   -   a) SEGo4=H1, SEGe4=Vss,
    -   b) All other signals=0V to ensure that only one 4 KB Odd CACHEmsb is selected for performing CS with 4 KB metal2 GBLs.

Arrow 5e (#5e): The #5e step is a Vinh precharge step for writing back the 4 KB Even MSB page data, last read out from the 4 KB Even CACHEmsb and later stored in the 4 KB SAs, to the 4 KB Even CACHEmsb again. Since the A-state PGM-VFY operation is an interim step, the 4 KB MSB Even page data have to be stored back to the 4 KB Even CACHEmsb for the subsequent Even LSB PGM-VFY operation. In addition, the 4 KB Even MSB page data in the 4 KB Even CACHEmsb have been corrupted by CS. Like the previous LD/LT step, the #5e write-back step requires a Vinh precharge on the 4 KB Even CACHEmsb. The preferred bias conditions are listed below:

-   -   a) PREe4=H1, PREo4=Vss and LBLps4=V2=Vinh.
    -   b) All other signals=0V to ensure that only one set of 4 KB Even CACHEmsb LBLs is exclusively connected to LBLps4.

Arrow 5o (#5o): The #5o step is a similar Vinh precharge step for writing back the 4 KB Odd MSB page data, last read out from the 4 KB Odd CACHEmsb and later stored in the 4 KB SAs, to the 4 KB Odd CACHEmsb again. Since the A-state PGM-VFY operation is an interim step, the 4 KB MSB Odd page data, like the Even page data, have to be stored back to the 4 KB Odd CACHEmsb for the subsequent Odd LSB PGM-VFY operation. In addition, the 4 KB Odd MSB page data in the 4 KB Odd CACHEmsb have been corrupted by CS. Like the previous #5e step, the #5o write-back step requires a Vinh precharge on the 4 KB Odd CACHEmsb. The preferred bias conditions are also opposite to the #5e step:

-   -   a) PREo4=H1, PREe4=Vss and same LBLps4=V2=Vinh.
    -   b) All other signals=0V to ensure that only one set of 4 KB Odd CACHEmsb LBLs is exclusively connected to LBLps4.

Note, in some cases, both the 4 KB Odd and 4 KB Even CACHEmsb can be precharged with Vinh at the same time. But in this case, the precharging is preferably performed in different time slots.

Arrow 6e (#6e): The #6e step is to write back the 4 KB Even MSB page and latch it from the 4 KB SA to the 4 KB Even CACHEmsb. The preferred bias conditions are listed below:

-   -   a) SEGe4=H1, PREo4=Vss and LBLps4=0V.
    -   b) All other signals=0V to ensure that only one set of 4 KB Even CACHEmsb LBLs is exclusively connected to 4 KB GBLs.

Arrow 6o (#6o): The #6o step is to similarly write back the 4 KB Odd MSB page and latch it from the 4 KB SA to the 4 KB Odd CACHEmsb. The preferred bias conditions are listed below:

-   -   a) SEGo4=H1, PREe4=Vss and LBLps4=0V.
    -   b) All other signals=0V to ensure that only one set of 4 KB Odd CACHEmsb LBLs is exclusively connected to 4 KB GBLs.

Arrow 7e (#7e): The #7e step is to perform Vinh/Vss analog sensing and amplification of the lastly stored 4 KB Even page Vtamin-read data stored in the 4 KB Even CACHEcel by 4 KB Multipliers first, then 4 KB SAs, and lastly 4 KB P/RB. The preferred bias conditions are summarized below:

-   -   a) SEGe1=H1, SEGo1=Vss,
    -   b) All other signals=0V to ensure that only the 4 KB Even CACHEcel is selected for performing CS.

Arrow 7o (#7o): The #7o step is to perform the similar Vinh/Vss analog sensing and amplification of the lastly stored 4 KB Odd page Vtamin-read data stored in the 4 KB Odd CACHEcel by 4 KB Multipliers first, then 4 KB SAs, and lastly 4 KB P/RB. The preferred bias conditions are opposite to the #7e step as summarized below:

-   -   a) SEGo1=H1, SEGe1=Vss,
    -   b) All other signals=0V to ensure that only one 4 KB Odd CACHEcel is selected for performing CS with 4 KB metal2 GBLs.

Arrow 8e (#8e): The #8e step is a Vinh precharge for writing back the 4 KB Even Vtamin-read page data, last read out from the 4 KB Even CACHEcel and later stored in the 4 KB P/RB, to the original 4 KB Even CACHEcel as well as the 4 KB Even CACHEint. Like the previous LD/LT step, the write back of the #8e step requires a Vinh precharge on both the 4 KB Even CACHEcel and the 4 KB Even CACHEint. The preferred bias conditions are listed below:

-   -   a) PREe1=H1, PREo1=Vss and LBLps1=V2=Vinh,
    -   b) PREe2=H1, PREo2=Vss and LBLps2=V2=Vinh,
    -   c) All other signals=0V to ensure that only two sets of 4 KB Even CACHEcel and CACHEint LBLs are exclusively connected to LBLps1 and LBLps2 respectively.

Note, every Vinh precharge is a self-timed step using one VLBLps1 Vinh Detector and one VLBLps2 Vinh Detector. In addition, both the 4 KB Even CACHEcel and the 4 KB Even CACHEint are precharged at the same time to save time. In this case, LBLps1 and LBLps2 can be combined into one line that uses only one VLBLps Detector.

Arrow 8o (#8o): The #8o step is similar to the #8e step, with the preferred bias conditions opposite to the #8e step:

-   -   a) PREo1=H1, PREe1=Vss and LBLps1=V2=Vinh,
    -   b) PREo2=H1, PREe2=Vss and LBLps2=V2=Vinh,
    -   c) All other signals=0V to ensure that only two sets of 4 KB Odd CACHEcel and CACHEint LBLs are exclusively connected to LBLps1 and LBLps2 respectively.

Note, every Vinh precharge is a self-timed step using one VLBLps1 Vinh Detector and one VLBLps2 Vinh Detector.

Arrow 9e (#9e): The #9e step is to write back the 4 KB Even Vtamin-read page and latch it from the 4 KB P/RB simultaneously into both the 4 KB Even CACHEcel and the 4 KB Even CACHEint via 4 KB GBLs to save time and ½ step. The preferred bias conditions are listed below:

-   -   a) SEGe1=H1, PREo1=Vss and LBLps1=0V,
    -   b) SEGe2=H1, PREo2=Vss and LBLps2=0V,
    -   c) All other signals=0V to ensure that only one set of 4 KB Even CACHEcel and 4 KB Even CACHEint LBLs are exclusively connected to 4 KB GBL lines for writing back Vtamin-read Even page data.

Arrow 9o (#9o): The #9o step is to similarly write back the 4 KB Odd Vtamin-read page and latch it from the 4 KB P/RB into both the 4 KB Odd CACHEcel and the 4 KB Odd CACHEint simultaneously to save time, power and ½ step. The preferred bias conditions of the #9o step are opposite to the #9e step:

-   -   a) SEGo1=H1, PREe1=Vss and LBLps1=0V,
    -   b) SEGo2=H1, PREe2=Vss and LBLps2=0V,
    -   c) All other signals=0V to ensure that only one set of 4 KB Odd CACHEcel and 4 KB Odd CACHEint LBLs are exclusively connected to 4 KB GBLs for writing back Vtamin-read Odd page data.

After the above 9 steps, all 8 KB page data of LSB and Vtamin-read (MSB-bit) are finally stored back to the corresponding CACHE1sb and CACHEmsb, and the updated and old iterative A-state Program page data are stored in the respective 8 KB CACHEcel and 8 KB CACHEint for continuing the next iterative A-state Program and PGM-VFY operations. In other words, the 8 KB CACHEcel is always used to store the most updated A-state PGM-VFY page data of the current iterative step, while the 8 KB CACHEint is always used to store the A-state PGM-VFY page data of the last iterative step.
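The bias conditions of the individual arrows share one pattern: a few signals are driven and all other signals are held at 0V. A minimal Python sketch of that pattern follows. The helper name `bias`, the reduced signal list and the use of text labels such as "H1" instead of voltages are assumptions for illustration; only the #3e through #5o entries shown are transcribed from the A-state description above, with the Vss entries folded into the 0V default.

```python
# Simplified, assumed signal set: SEG/PRE gates for four CACHEs, four LBLps lines, two TIE gates.
ALL_SIGNALS = (
    [f"{p}{h}{n}" for p in ("SEG", "PRE") for h in ("e", "o") for n in range(1, 5)]
    + [f"LBLps{n}" for n in range(1, 5)]
    + ["TIE1", "TIE2"]
)


def bias(**driven):
    """Return a full bias table: listed signals as given, all other signals at 0V."""
    unknown = set(driven) - set(ALL_SIGNALS)
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    table = dict.fromkeys(ALL_SIGNALS, "0V")
    table.update(driven)
    return table


A_STATE_STEPS = {
    "3e": bias(SEGe2="H1"),                  # load Even LSB data into Even CACHEint
    "3o": bias(SEGo2="H1"),                  # load Odd LSB data into Odd CACHEint
    "4e": bias(SEGe4="H1"),                  # sense Even CACHEmsb
    "4o": bias(SEGo4="H1"),                  # sense Odd CACHEmsb
    "5e": bias(PREe4="H1", LBLps4="Vinh"),   # precharge Even CACHEmsb
    "5o": bias(PREo4="H1", LBLps4="Vinh"),   # precharge Odd CACHEmsb
}

for step, table in A_STATE_STEPS.items():
    print(step, {k: v for k, v in table.items() if v != "0V"})
```

Keeping every step as a complete table makes the "all other signals=0V" condition explicit rather than implied.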

FIG. 7C is a diagram showing circuit structure and method for m-page MLC (LSB page) B-state Program-Verify operation according to an embodiment of the present invention. As shown, this is a second flow of m-page LSB Even B-cell Program-Verify operation to program the selected B1′-state cells to B-state cells by using a Vtbmin as an m-page Program-Verify WL voltage in all selected m pages when the corresponding inputted bit data of LSB=1 but adjusted to “0” and MSB=0. Like the A-state Block PGM-VFY methodology shown in FIG. 7B, this methodology is used along with the flows shown in FIG. 7J and FIG. 7K to explain the preferred m-page MLC (LSB page) B-state PGM-VFY operation.

In an embodiment, the B-state PGM-VFY methodology is also based on a scenario in which the m-page MLC MSB and LSB A-state have been programmed and verified successfully in all selected MLC cells in m selected MLC-WLs with a condition of Flag cell bit=0 in each selected MLC-WL. Note, this methodology is designed based on a fixed rule that each MSB bit Program has to be performed before each LSB bit Program and that A-state is Program-Verified before the B-state PGM-VFY operation, so that the Vread and Vdd voltages latched on the m selected sets of 64WL+1SSL+1GSL in A-state can be used in a recyclable manner for the next (second) B-state PGM-VFY operation. Thus power and latency savings of the second B-state PGM-VFY operation over the first A-state PGM-VFY operation can be achieved. The setting of Vtamin for A-state and Vtbmin for B-state for each set of 64WL+1SSL+1GSL will be explained subsequently.

When dealing with A-state Program-Verify, 8 KB internal MSB page data under Vtamin-read are stored in the 8 KB CACHEmsb as indicated in the #6e and #6o steps in FIG. 7B. Conversely, when dealing with B-state Program-Verify, 8 KB LSB page data sequentially loaded from 8 I/Os are stored in 2-cycle (shown below) in the 8 KB CACHE1sb as indicated in the #5e and #5o steps in FIG. 7C with the preferred bias conditions listed below.

-   -   a) PREe3=H1 and PREo3=Vss and LBLps3=V2=Vinh.
    -   b) PREo3=H1 and PREe3=Vss and LBLps3=V2=Vinh.

In addition, the Vtamin used for A-state PGM-VFY in FIG. 7B is replaced by the Vtbmin used for the B-state PGM-VFY operation as indicated in #2 of FIG. 7C. All other control signals and steps are the same for both FIG. 7B and FIG. 7C. Thus the detailed description of the B-state PGM-VFY methodology of FIG. 7C is omitted herein.

FIG. 7D is a diagram showing circuit structure and method for performing m-page MLC (LSB page) C-state Program-Verify operation according to an embodiment of the present invention. As shown, this is a third flow of m-page LSB Even C-cell Program-Verify operation to program the selected B1′-state or B2′-state cells to C-state cells by using a Vtcmin as an m-page Program-Verify WL voltage in all selected m pages when the corresponding inputted bit data of LSB=0 and MSB=0. Basically, this methodology is similar to the previous methodologies of FIG. 7A and FIG. 7B but with fewer steps because this is the last check of C-state of the m-page 4-Vt MLC (LSB) PGM-VFY operation. This methodology is used along with the flows shown in FIG. 7H and FIG. 7I to explain the preferred m-page MLC (LSB page) C-state PGM-VFY operation.

In an embodiment, the C-state PGM-VFY methodology is based on one scenario in which the m-page MLC MSB and LSB bits have been programmed and verified successfully with A-state and B-state in all selected MLC cells in m selected MLC-WLs with a condition of Flag cell bit=0 in each selected MLC-WL. This preferred m-page MLC (LSB page) C-state PGM-VFY operation comprises 6 steps indicated by arrow # such as 1, 2, 3e, 3o, 4e, 4o, 5e, 5o, 6e and 6o, with several basic operations similar to those explained before.

In a specific embodiment, the C-state PGM-VFY methodology includes precharging only two of the CACHEs, CACHEcel and CACHEint, discharging selective C_(LBL) capacitors, loading and latching 8 KB updated C-state page data from the 4 KB P/RB in 2-cycle into both the 8 KB CACHEcel and the 8 KB CACHEint, and charge-sharing the C_(LBL) capacitors with the corresponding GBLs. In particular, as indicated by Arrows #1 and #5, only the two pseudo CACHEcel and CACHEint registers are selected for precharging with Vinh, at different points in the timeline, for the C-state PGM-VFY operation. In the discharging step, indicated by Arrow 2, the selective C_(LBL) capacitors in the selected CACHEcel and CACHEint are either discharged to Vss or retained at Vinh in accordance with the new readout MSB page data after the A-state PGM-VFY operation. Next, in the loading and latching (LD/LT) step, indicated by Arrow 6, 8 KB updated C-state page data are loaded from the 4 KB P/RB in 2-cycle into both the 8 KB CACHEcel and the 8 KB CACHEint, where they are temporarily latched. Lastly, in the charge-sharing step, indicated by Arrows 3 and 4, the 8 KB CACHEcel is used to store the updated or new 8 KB C-state Program page data after each iterative PGM-VFY step, while the 8 KB CACHEint is used to store the last or old C-state Program page data after the last iterative PGM-VFY step.

Arrow 1 (#1): As #1 indicates, only the 4 KB Even CLBL1e and 4 KB CLBL1o capacitors of CACHEcel are selected for Vinh precharging with the following preferred bias conditions.

-   -   a) PREo1=PREe1=H1 with LBLps1=V2=Vinh
        -   These conditions are to precharge both 4 KB Even and 4 KB Odd CACHEcel C_(LBL) capacitors simultaneously within 1-cycle of a self-timed step.
    -   b) LBLps2=LBLps3=LBLps4=0V
    -   c) PREo2=PREe2=PREo3=PREe3=PREo4=PREe4=0V
        -   No other 8 KB CACHEs are to be precharged.
    -   d) TIE1= . . . =TIEL/2=0V, where L=4
        -   This is to prevent leakage from 8 KB CACHEcel to 8 KB CACHEint.
    -   e) SEGo1=SEGe1=0V
        -   This is to prevent leakage of 4 KB Even CACHEcel and 4 KB Odd CACHEcel to the common 4 KB GBL lines.
    -   f) All other SEGo2/3/4=SEGe2/3/4=0V because not required.

Arrow 2 (#2): This #2 step performs the 8 KB C-state PGM-VFY operation, which results in each C_(LBL) voltage either dropping to Vss due to discharging or being retained at Vinh due to no discharging, depending on the cells' states in CACHEcel. For example, under selected WL=Vtcmin, E-cells=A-cells=B-cells=Vss because Vtemax<Vtamin<Vtbmin<Vtcmin, but C-cells=Vinh because there is no cell current conduction with Vtcmin≦Vtc. In conclusion, after this step, the 8 KB MLC analog cell voltages, in which E-cells=A-cells=B-cells=Vss are differentiated from C-cells=Vinh, are stored in the m 8 KB CACHEcel.

Arrow 3e (#3e): The #3e step is to perform Vinh/Vss analog sensing and amplification of the last or old 4 KB Even C-state Program data stored in the 4 KB Even CACHEint by 4 KB Multipliers first and then 4 KB SAs thereafter. The preferred bias conditions are summarized below:

-   -   a) SEGe2=H1, SEGo2=Vss,
    -   b) All other signals=0V to ensure that only the 4 KB Even CACHEint is selected for performing CS.

Arrow 3o (#3o): The #3o step is to perform the similar Vinh/Vss analog sensing and amplification of the last or old 4 KB Odd C-state Program page data stored in the 4 KB Odd CACHEint by 4 KB Multipliers first and then 4 KB SAs. The preferred bias conditions are opposite to the #3e step above and are summarized below:

-   -   a) SEGo2=H1, SEGe2=Vss,
    -   b) All other signals=0V to ensure that only one 4 KB Odd CACHEint is selected for performing CS with 4 KB metal2 GBLs.

Arrow 4e (#4e): The #4e step is to perform Vinh/Vss analog sensing and amplification of the current or new 4 KB Even C-state Program data stored in the 4 KB Even CACHEcel by 4 KB Multipliers first and then 4 KB SAs thereafter. The preferred bias conditions are summarized below:

-   -   a) SEGe1=H1, SEGo1=Vss,
    -   b) All other signals=0V to ensure that only the 4 KB Even CACHEcel is selected for performing CS.

Arrow 4o (#4o): The #4o step is to perform a similar Vinh/Vss analog sensing and amplification of the current or new 4 KB Odd C-state Program page data stored in the 4 KB Odd CACHEcel by 4 KB Multipliers first and then 4 KB SAs. The preferred bias conditions are opposite to the #4e step and are summarized below:

-   -   a) SEGo1=H1, SEGe1=Vss,
    -   b) All other signals=0V to ensure that only one 4 KB Odd CACHEcel is selected for performing CS with 4 KB metal2 GBLs.
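Each sensing step above connects exactly one Even or Odd CACHE half to the shared 4 KB GBLs. The following sketch, offered only as an illustration, decodes a set of SEG levels into the selected half and rejects settings that would put more than one half on the GBLs. The function name and the mapping of index 1 to 4 onto CACHEcel, CACHEint, CACHE1sb and CACHEmsb are assumptions inferred from the signal names used in this description.

```python
CACHE_NAMES = {1: "CACHEcel", 2: "CACHEint", 3: "CACHE1sb", 4: "CACHEmsb"}
OFF_LEVELS = {"0V", "Vss"}


def cache_on_gbl(seg_levels):
    """Return the single CACHE half that the given SEG levels connect to the GBLs.

    seg_levels maps names such as "SEGe2" to a level label ("H1", "Vss", "0V").
    Raises ValueError when zero or several halves are connected, since
    charge-sharing with the shared GBLs needs exactly one.
    """
    on = [name for name, level in seg_levels.items() if level not in OFF_LEVELS]
    if len(on) != 1:
        raise ValueError(f"expected exactly one SEG gate on, got {on}")
    name = on[0]
    half = "Even" if name[3] == "e" else "Odd"
    return f"{half} {CACHE_NAMES[int(name[4:])]}"


print(cache_on_gbl({"SEGe2": "H1", "SEGo2": "Vss"}))  # bias of step #3e
print(cache_on_gbl({"SEGo1": "H1", "SEGe1": "Vss"}))  # bias of step #4o
```

The write-back steps #6e and #6o deliberately enable two SEG gates at once, so a check of this kind would apply to the sensing steps only.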

Arrow 5e (#5e): The #5e step is a Vinh precharge step for writing back the updated 4 KB Even C-state Program page data from the 4 KB P/RB to both the Even CACHEcel and the Even CACHEint for the next iterative Program and PGM-VFY operation if the current C-state PGM-VFY does not pass. The preferred bias conditions are summarized below:

-   -   a) PREe1=PREe2=H1 and LBLps1=LBLps2=V2=Vinh,
        -   This is to precharge both 4 KB Even CACHEcel and 4 KB Even CACHEint with Vinh only.
    -   b) PREo1=PREo2=Vss
        -   This is to prevent the bus contentions from happening between 4 KB Odd CACHEcel and 4 KB Odd CACHEint and 4 KB GBLs.
    -   c) All other signals=0V to ensure the writing back only happens to above said 4 KB Even CACHEcel and 4 KB Even CACHEint.

Arrow 5o (#5o): The #5o step is a similar Vinh precharge step for writing back the updated 4 KB Odd C-state page data from the 4 KB P/RB to both the 4 KB Odd CACHEcel and the 4 KB Odd CACHEint. The preferred bias conditions are also opposite to the #5e step:

-   -   a) PREo1=PREo2=H1 and LBLps1=LBLps2=V2=Vinh,
        -   To precharge both 4 KB Odd CACHEcel and 4 KB Odd CACHEint with Vinh only.
    -   b) PREe1=PREe2=Vss
        -   To prevent the bus contentions from happening between 4 KB Even CACHEcel and 4 KB Even CACHEint and 4 KB GBLs.
    -   c) All other signals=0V to ensure the writing back only happens to above said 4 KB Odd CACHEcel and 4 KB Odd CACHEint.

Arrow 6e (#6e): The #6e step is an LD/LT step for writing back the updated C-state Even Program page data from the 4 KB P/RB to both the 4 KB Even CACHEcel and the 4 KB Even CACHEint simultaneously in 1-cycle of a self-timed operation. The preferred bias conditions are listed below:

-   -   a) SEGe1=SEGe2=V1=Vdd to allow the digital Vdd/Vss to analog Vinh/Vss conversion to the selected 4 KB Even CACHEcel and 4 KB Even CACHEint C_(LBL) capacitors.
    -   b) All other signals=0V to ensure only two sets of 4 KB Even CACHEcel and 4 KB Even CACHEint LBLs are exclusively connected to 4 KB P/RB via 4 KB GBLs.

Arrow 6o (#6o): The #6o step is similar to the #6e step, with the preferred bias conditions also opposite to the #6e step:

-   -   a) SEGo1=SEGo2=V1=Vdd to allow the digital Vdd/Vss to analog Vinh/Vss conversion to the selected 4 KB Odd CACHEcel and 4 KB Odd CACHEint C_(LBL) capacitors.
    -   b) All other signals=0V to ensure only two sets of 4 KB Odd CACHEcel and 4 KB Odd CACHEint LBLs are exclusively connected to 4 KB P/RB via 4 KB GBLs.

FIG. 7E is a flow chart showing a method for performing m-page MLC (LSB Even page) Data Loading and B′-adjustment according to an embodiment of the present invention. Basically, the detailed steps of this flow are designed in accordance with the steps set by the preferred Even LSB page data loading and B′-adjustment methodology as explained in FIG. 7A, as well as the circuits of the HiNAND2 array in FIG. 1A, the Block-decoder in FIG. 2A, the Segment-decoder in FIG. 2B and the Data Buffer in FIG. 3.

As shown, the flow in FIG. 7E starts with a first step of loading m Even pages of 8 KB 4-Vt MLC LSB raw data from 8 external I/Os, followed by a second step of a preferred B′-data adjustment logic operation to obtain m newly readjusted correct Even LSB page data to be stored into m designated pseudo Even CACHE registers for the subsequent concurrent m-page MLC LSB Even page Program operation.

Like the previous SLC/MLC (MSB) Page loading and Program operation, the m pages of 8 KB MLC LSB data to be loaded have to be split into m 4 KB Even and m 4 KB Odd pages to accommodate the 4 KB GBL bus lines. This flow shows the loading and B′ bit adjustment of the m Even LSB pages only. But the preferred Program and Program-Verify of m LSB pages will be done on both m 4 KB Even and m 4 KB Odd pages simultaneously to obtain an m-fold reduction in time by the present invention. Note, since the present invention is disclosed for a hybrid SLC-WL and MLC-WL 64-WL NAND Block of the HiNAND2 array, MLC LSB page data loading and B′-adjustment are only performed on those m selected MLC-WLs with 4-Vt cells. The flow starts from Step 430.

Step 430: This step is to sequentially receive and load the MLC LSB command and m LSB page addresses via 8 I/Os into the designated Command register of the HiNAND2 array, and into m Address Buffers (not shown), m latches of m Segments, and m latches of m Blocks, respectively. For example, the LSB Command is loaded into the designated Command register so that this new Block LSB command can be decoded first and some of the initial steps of the m 8 KB LSB B′-adjustment operation can be started immediately before or while m pages of lengthy 8 KB LSB page data are loaded into the 4 KB real CACHE registers to save time.

Note, m random-page Addresses are loaded into m designated on-chip Address Buffers (not shown) in conjunction with other control circuits to set the corresponding m Segment latches as shown in FIG. 2B and m Block latches as shown in FIG. 2A of the preferred HiNAND2 array.

In addition, the m addressed 8 KB LSB page data are divided into m 4 KB Even LSB and m 4 KB Odd LSB page data. These m pages of LSB page data are selected concurrently by m Segment latches with m Block latches. In an embodiment, this HiNAND2 array comprises m LSB page Addresses that can be selected concurrently in every selected NAND Plane. Thus up to m page Addresses can be flexibly specified in this novel LSB Program command.

Once the LSB command and addresses are loaded and decoded, the next two steps can be performed concurrently. One flow moves to Step 431 to do the desired concurrent m-page C_(LBL) precharging in four CACHEs, and the other flow moves to Step 435 to start sequentially loading each 4 KB Even LSB page data in units of bytes from the external Flash Controller via 8 I/Os.

Step 431: Concurrent LBL Vinh precharging. This step is one of the preferred m-page operations, used to perform concurrent LBL Vinh precharge on both Even and Odd C_(LBL) capacitors within four selected CACHEs such as 8 KB CACHEcel, 8 KB Even CACHEint, 8 KB Even CACHEmsb, and 8 KB CACHE1sb with the following preferred bias conditions:

-   -   I. Self-timed Vinh precharging on C_(LBL) capacitors:
        -   It is done by LBLps Vinh Detector circuit along with the following preferred bias conditions.
        -   a) PREe1=PREo1=H1 and LBLps1=Vinh for precharging CACHEcel 4 KB CLBL1e and 4 KB CLBL1o capacitors,
        -   b) PREe2=PREo2=H1 and LBLps2=Vinh for precharging CACHEint 4 KB CLBL1e and 4 KB CLBL1o capacitors,
        -   c) PREe3=PREo3=H1 and LBLps3=Vinh for precharging CACHE1sb 4 KB CLBL1e and 4 KB CLBL1o capacitors,
        -   d) PREe4=PREo4=H1 and LBLps4=Vinh for precharging CACHEmsb 4 KB CLBL1e and 4 KB CLBL1o capacitors.
    -   II. Preventing leakage from C_(LBL) to C_(GBL):
        -   a) DIVen=SEGo1=SEGe1=0V for preventing the leakage of 4 KB CLBL1o and 4 KB CLBL1e capacitors in CACHEcel to the common 4 KB GBLs,
        -   b) DIVen=SEGo2=SEGe2=0V for preventing the leakage of 4 KB CLBL1o and 4 KB CLBL1e capacitors in CACHEint to the common 4 KB GBLs,
        -   c) DIVen=SEGo3=SEGe3=0V for preventing the leakage of 4 KB CLBL1o and 4 KB CLBL1e capacitors in CACHE1sb to the common 4 KB GBLs,
        -   d) DIVen=SEGo4=SEGe4=0V for preventing the leakage of 4 KB CLBL1o and 4 KB CLBL1e capacitors in CACHEmsb to the common 4 KB GBLs.
    -   III. Independent Vinh C_(LBL) precharging between the paired CACHEs.
        -   a) TIE1=0V for CACHEcel and CACHEint independent Vinh precharging,
        -   b) TIEL/2=0V (L=4) for CACHE1sb and CACHEmsb independent Vinh precharging.

Step 432: XT bus checking. This decision step checks whether the common bus lines of 64XTs+1SSLp+1GSLp are free of any existing concurrent operations after the Vinh-precharging step. If No, then the loading of the 4 KB Even LSB page data stored in the 4 KB real CACHE into the 4 KB Even pseudo CACHE loops to wait until all XT bus lines are free and available. If Yes, then the flow moves to Step 433 to set the corresponding Flag-latches in the selected Segment-decoder and Block-decoder without waiting, to save time. Note, each XT bus line means one set of shared 64XTs+1SSLp+1GSLp metal lines.
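The wait-until-free decision of Step 432 can be sketched as a simple polling loop. This is only an illustrative software analogy of the on-chip decision step; the function name, the `is_free` callback and the polling limit are hypothetical and do not correspond to any disclosed circuit.

```python
import time


def wait_for_xt_bus(is_free, poll_interval_s=0.0, max_polls=1000):
    """Poll until the shared XT bus is reported free.

    Returns the number of unsuccessful polls before the bus became free.
    Raises TimeoutError if the bus never becomes free within max_polls.
    """
    for polls in range(max_polls):
        if is_free():
            return polls  # the flow may now proceed to Flag-latch setting (Step 433)
        time.sleep(poll_interval_s)
    raise TimeoutError("XT bus still occupied")


# Hypothetical bus that is occupied for two polls and then released.
bus_states = iter([False, False, True])
polls = wait_for_xt_bus(lambda: next(bus_states))
print(polls)
```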

Step 433: Flag-latch setting in selected Blocks and Segments. Once the XT bus is released and free, m Flag latches of the newly selected m Blocks in m Segments can be selectively set on a Segment-by-Segment basis in accordance with the newly loaded m-page Addresses and the following preferred bias conditions:

-   -   a) ENB=1=Vdd,
    -   b) CLWL=CLA=CLR=0V,
    -   c) ENS=one-shot pulse of Vdd.

The one-shot ENS sets each Block flag latch to bring its XD node to Vdd once the 3 inputs Pi, Qj, and Sk of each selected Block are matched. Each flag latch is made of INV3 and INV4, and its output node XD is gated by a common signal CLWL (see FIG. 2A).

Step 434: A first self-timed Charging & Latching of one set of 64WLs+1SSL+1GSL under Vtamin-read condition. Once the flag latches of m random pages are selected and set, the selected sets of 64WLs+1SSL+1GSL are charged with the predetermined voltages accordingly for the purpose of the subsequent B′-adjustment. The whole precharge operation of each set is a self-timed operation automatically controlled by one corresponding Vread Detector per Segment. This Vread Detector is preferably connected to one end of a dummy WL, which is the chosen middle WL of a 3-WL IC layout, to track each WL's resistance and capacitance, including the adjacent WL-WL parasitic capacitance.

Since the desired voltages of each set of 64WLs+1SSL+1GSL are coupled from one common set of 64XTs+1SSLp+1GSLp bus lines, the voltage setup of the sets of 64WLs+1SSL+1GSL has to be done one set at a time. As a result, setting the voltages of m random sets of 64WLs+1SSL+1GSL cannot be done in 1-cycle of this m-page Program for either m random or non-random WLs. In a specific embodiment, a two-step method for setting the voltages of m random sets of 64WLs+1SSL+1GSL is proposed for m random-page Block Program and Program-Verify operations. It is merely an example for illustrative purposes and should not limit the scope of the claims. In the first step, each set of 64XTs+1SSLp+1GSLp is set with varied voltages. In the second step, the set of 64XTs+1SSLp+1GSLp bus lines is connected to each set of 64WLs+1SSL+1GSL for charging. The varied voltages of each set of 64WLs+1SSL+1GSL mean charging 1 selected WL with Vtamin, 63 non-selected WLs with a same Vread set to 4V-6V, 1 SSL set to Vdd, and 1 GSL set to Vread for the whole 8 KB selected 64-cell-String Blocks, in accordance with the same voltage setting of one parent set of 64XTs+1SSLp+1GSLp bus lines, with the HXD node at a voltage equal to or greater than Vread+Vt because a local pump is enabled by the selected Flag latches.
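The first step of the two-step method, setting one parent set of 64XTs+1SSLp+1GSLp bus lines, can be illustrated with the short sketch below. The function name and line labels are hypothetical; Vread=5V is an assumed value inside the 4V-6V range stated above, and Vdd=1.8V and Vtamin=0.5V are the example values used elsewhere in this description.

```python
def xt_bus_setting(selected_wl, v_selected, vread=5.0, vdd=1.8, n_wl=64):
    """Build the voltage setting of one set of 64XTs+1SSLp+1GSLp bus lines.

    One selected XT carries the read/verify voltage, the other 63 XTs carry
    Vread, SSLp carries Vdd and GSLp carries Vread.
    """
    if not 0 <= selected_wl < n_wl:
        raise ValueError("selected_wl out of range")
    setting = {f"XT{i}": vread for i in range(n_wl)}
    setting[f"XT{selected_wl}"] = v_selected
    setting["SSLp"] = vdd
    setting["GSLp"] = vread
    return setting


setting = xt_bus_setting(selected_wl=10, v_selected=0.5)  # Vtamin-read setting
print(setting["XT10"], setting["XT11"], setting["SSLp"], setting["GSLp"])
```

In the second step this same setting would be coupled, one set at a time, onto each selected set of 64WLs+1SSL+1GSL.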

The voltage of each HXD signal plays an important bridging role between the above set of 64XTs+1SSLp+1GSLp metal lines and the set of 64WLs+1SSL+1GSL poly lines for each selected Block of each selected Segment. During Step 434, when the common gate HXD signal is pumped to Vread+Vt, the voltages of Vread, Vtamin and Vdd in each set of 64XTs+1SSLp+1GSLp can be fully coupled to the 64WLs+1SSL+1GSL lines without any voltage drop, because Vread is the highest voltage of WL, SSL, and GSL during this Vtamin Read step. Note, Vtamin Read is to retrieve the MSB bit data that were programmed in the MLC cells during the m-page MSB-bit Program.
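The role of HXD as the common gate can be illustrated with an idealized NMOS pass-gate model: a gate at HXD can transfer at most HXD−Vt, and a gate at Vss isolates the line so that it holds its previous voltage. The sketch below is a first-order assumption for illustration only; the function name and the Vt=1V value are hypothetical.

```python
def wl_after_coupling(v_xt, v_hxd, v_wl_prev=0.0, vt=1.0):
    """Idealized WL voltage after coupling from an XT line through a pass gate.

    v_xt      : voltage driven on the XT bus line
    v_hxd     : common gate voltage of the pass transistors
    v_wl_prev : voltage previously held on the WL parasitic capacitance
    """
    limit = v_hxd - vt  # highest level the pass gate can transfer
    if limit <= 0:
        return v_wl_prev              # gate off: the WL floats and keeps its voltage
    if v_xt <= limit:
        return v_xt                   # fully coupled, no voltage drop
    return max(v_wl_prev, limit)      # clipped at HXD-Vt unless already higher


VREAD = 5.0
print(wl_after_coupling(VREAD, VREAD + 1.0))          # HXD = Vread+Vt
print(wl_after_coupling(VREAD, VREAD))                # HXD too low
print(wl_after_coupling(0.0, 0.0, v_wl_prev=VREAD))   # HXD = Vss
```

With HXD at Vread+Vt the full Vread reaches the WL, and with HXD at Vss the previously coupled voltage stays latched on the line.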

Conversely, once a Vread Detector detects Vread at the end of the dummy WL and the HXD node is set to Vss, the desired B′-state read voltages of Vread, Vdd, and Vtamin are latched for a long time on the corresponding large parasitic poly2 capacitors of the respective lines of one set of 64WLs+1SSL+1GSL. Thus the subsequent m-page LSB page B′-adjustment step can be executed concurrently. This flow is designed to read out one 4 KB Even MSB page data first and then another 4 KB Odd MSB page data thereafter.

Step 435: A first self-timed C_(LBL) Vinh-discharging-or-retaining step. This step is another self-timed operation to discharge each LBL to Vss or retain its Vinh voltage by setting Vtamin for each selected WL, in accordance with the m-page 8 KB MSB page data stored in one WL of the selected 8 KB WLs in one of the 8 KB CACHEcel. Again, this is another operation of the present invention that is performed like an ABL (All-BL) scheme on both Even and Odd MSB bits in a Read operation per one physical MLC-WL.

The final value of each C_(LBL) bit voltage is determined by each NAND cell's state in CACHEcel. If a NAND cell is in E-state, the Vinh voltage of its corresponding C_(LBL) is discharged to Vss; otherwise, if the NAND cell is in B′-state with Vtb′min>Vtamin, the precharged C_(LBL) voltage retains its initial Vinh value, as summarized below.

-   -   a) E-cell=0V in CACHEcel=CACHEint because Vtamin>Vtemax>−0.5V and TIE1=Vdd.
    -   b) B′-cell=Vinh in CACHEcel=CACHEint because Vtb′min>Vtemax.

The preferred bias conditions are listed below:

-   -   a) CSL=PREe=PREo=LBLps=0V
        -   It means all 4 KB PREe1=4 KB PREe2=4 KB PREo1=4 KB PREo2=Vss. This is to prevent C_(LBL) leakage of both CACHEcel and CACHEint to the common LBLps lines.
    -   b) SEGo=SEGe=0V
        -   This is to prevent metal1 C_(LBL) leakage of both CACHEcel and CACHEint to the common metal2 4 KB GBLs.
    -   c) DIVen=CSL=0V,
    -   d) TIE1=Vdd,
        -   Since the selected E-cells are in CACHEcel only, thus CACHEcel discharges Vinh to Vss. But by setting TIE1=Vdd to connect CACHEcel and CACHEint, then Vinh in CACHEint will be discharged to Vss according to E-cells in each corresponding bit of CACHEcel.
    -   e) TIE2=0 because no cells of CACHE1sb and CACHEmsb are selected for MSB-evaluation. Thus no Vinh discharging will happen to CACHEmsb and CACHE1sb and no need to turn on TIE2.
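The effect of TIE1 in Step 435 can be sketched as follows. With TIE1=Vdd the paired CACHEint follows CACHEcel bit by bit, and with TIE1=0V CACHEint keeps its precharged Vinh. This is a behavioral illustration only; the function name and the 7 V value of Vinh are assumptions.

```python
def discharge_step(cell_conducts, tie1_on, vinh=7.0, vss=0.0):
    """Return (CACHEcel, CACHEint) C_LBL voltages after a discharge-or-retain step.

    cell_conducts : one boolean per bit, True when the selected cell conducts
                    at the applied WL voltage (e.g. an E-cell under Vtamin)
    tie1_on       : True when TIE1=Vdd connects CACHEint to CACHEcel
    """
    cachecel = [vss if conducts else vinh for conducts in cell_conducts]
    if tie1_on:
        cacheint = list(cachecel)            # discharged together with CACHEcel
    else:
        cacheint = [vinh] * len(cachecel)    # isolated: keeps the precharge
    return cachecel, cacheint


# Bits: E-cell, B'-cell, E-cell under the Vtamin read with TIE1=Vdd.
cel, cint = discharge_step([True, False, True], tie1_on=True)
print(cel, cint)
```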

Step 436: XT-free decision check. Right after the desired voltages of 64WLs+1SSL+1GSL are latched on the parasitic capacitors of 64WLs+1SSL+1GSL, the XT bus is temporarily freed from the setting of the above B′-adjustment operation and is open for any interrupting calls from other operations with higher priority. If there is no such new urgent request, the flow moves to Step 438 to continue the next Vtbmin read step of the B′-bit adjustment of the same Even LSB page.

Step 438: A second self-timed setting & latching of 64WLs+1SSL+1GSLvoltages under Vtbmin-Read condition. Unlike Step 434, the firstself-timed Charging & Latching of one set of 64WLs+1SSL+1GSL underVtamin-Read condition, to differentiate E-state cells of B′-state cellsunder Vtamin second verify voltage, this Step 438 continues to furtherdifferentiate cells of E-state and B1′-state cells from B2′-state cellsunder new Vtbmin. Since only the voltage of one selected WL=Vtamin ischanged to Vtbmin with the rest of other 63WLs+1SSL+1GSL voltages beingretained, thus the second setting of one set of 64WLs+1SSL+1GSL isdifferent from the first setting because no more 63 HV Vread chargingoperations are required. Only one LV change from Vtamin to Vtbmin needsto be done. Thus the second WL precharging can be performed much fasterwith less power consumption. That is why the latched voltages of thelast selected group of 64WLs+1SSL+1GSL lines are kept under Vtamin. Thereason for saving lastly precharged voltages of m sets of64WLs+1SSL+1GSL is to save power because there is no need to supply therequired Vread and H1 voltages again within the short interval betweenStep 434 and Step 438. In first Vtamin charging process, the pump clockof WLPH is enabled so that m selected HXD nodes can be pumped higherthan Vread+Vt with VHV>Vread in accordance with Segment decoder circuitshown in FIG. 2B.

Setting the second voltages for m sets of 64WLs+1SSL+1GSL is performed,as the first step, to set 64XTs+1SSLp+1GSLp bus lines first with oneselected XT=Vtbmin but the rest of 63 XTs=Vread, SSLp=Vdd, andGSLp=Vread with WLPH clock signal being enabled. Then, whenHXD≧Vread+Vt, the voltages of a set of 64WLs+1SSL+1GSL correspondinglyequal to voltages of 64XTs+1SSLp+1GSLp bus lines. Since the voltages of63 unselected WLs and 1SSL+1GSL are retained, thus there is only onecurrent flow in one selected WL to charge up from Vtamin to Vtbmin. In acase of Vtamin=0.5V and Vtbmin=1.5V, then 1.8V Vdd is strong enough tocomplete this setting within 100 μs. For m randomly selected WLs, thenit takes m cycles to complete all second WL charging. Note, since Vreadis already there for this second charging step, thus there is no need ofVread detection but Vtbmin detection at end of dummy WL for a secondself-timed WL charging control.
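The point that only one line changes between the first (Vtamin) and second (Vtbmin) settings can be illustrated with a minimal sketch. The helper names `wl_bias` and `lines_to_recharge`, and the choice of WL17 as the selected word line, are hypothetical and used only for illustration; voltage levels are kept symbolic because the text does not fix a Vread value.

```python
def wl_bias(selected_wl, selected_level, n_wl=64):
    """Return the target level of every line in one 64WLs+1SSL+1GSL set."""
    bias = {f"WL{i}": "Vread" for i in range(1, n_wl + 1)}  # unselected WLs
    bias[f"WL{selected_wl}"] = selected_level               # the one selected WL
    bias["SSL"] = "Vdd"
    bias["GSL"] = "Vread"
    return bias


def lines_to_recharge(first, second):
    """List the lines whose latched level differs between two settings."""
    return [name for name in first if first[name] != second[name]]


# WL17 is an arbitrary (hypothetical) selected word line.
first_setting = wl_bias(selected_wl=17, selected_level="Vtamin")   # as in Step 434
second_setting = wl_bias(selected_wl=17, selected_level="Vtbmin")  # as in Step 438
changed = lines_to_recharge(first_setting, second_setting)
print(len(first_setting), changed)
```

Under these assumptions the set holds 66 lines and only the selected WL appears in `changed`, which is why the second charging needs no renewed Vread supply for the other 65 lines.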

Step 439: A second self-timed C_(LBL) Vinh Discharging and Retaining. This step is like Step 435, being a second self-timed C_(LBL) Vinh discharging or retaining operation after Step 438 latches Vtbmin and Vread in the corresponding WLs, SSL, and GSL lines. Again, each final C_(LBL) bit voltage is determined by each NAND cell's state in each CACHEcel as summarized below:

-   -   a) B2′-cell=N-cell=Vinh in CACHEcel only because Vtb′2min>Vtb′min and TIE1=0V
    -   b) E-cell=A-cell=B1′-cell=0V and TIE2=0V because Vtb′2min>Vtb′min>Vtamin.
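The discharge-or-retain outcomes of Steps 435 and 439 can be sketched with a simple threshold model: a cell whose Vt lies below the selected WL voltage conducts and discharges its precharged C_(LBL) to Vss, otherwise the capacitor keeps Vinh. This is an illustrative sketch only. The per-state cell Vt values in `CELL_VT` are hypothetical, while Vtamin=0.5V, Vtbmin=1.5V and Vinh≧7V are the example figures quoted in the text.

```python
VSS = 0.0
VINH = 7.0      # the text quotes Vinh >= 7V; 7.0 is used here as an example
VTAMIN = 0.5    # example value given in the text
VTBMIN = 1.5    # example value given in the text

# Hypothetical cell threshold voltages, chosen only for illustration.
CELL_VT = {"E": -1.0, "B1'": 1.0, "B2'": 2.0}


def clbl_after_read(cell_vt, wl_read_voltage, vinh=VINH):
    """A conducting cell (Vt below the WL level) discharges C_LBL to Vss."""
    return VSS if cell_vt < wl_read_voltage else vinh


def read_page(wl_read_voltage):
    """Final C_LBL voltage per cell state for one selected-WL read level."""
    return {state: clbl_after_read(vt, wl_read_voltage)
            for state, vt in CELL_VT.items()}


print("Vtamin read:", read_page(VTAMIN))  # E discharges, B1' and B2' retain
print("Vtbmin read:", read_page(VTBMIN))  # E and B1' discharge, B2' retains
```

With these assumed Vt values the model reproduces the summary above: under Vtamin only E-cells discharge, and under Vtbmin only B2′-cells retain Vinh.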

Other preferred bias conditions are identical to the ones in Step 435, thus omitted here for description simplicity.

After Step 435 and Step 439, both the Even and Odd A-state MSB page data under Vtamin are stored in 8 KB CACHEint and both the Even and Odd B-state page data are stored in 8 KB CACHEcel, preparing for the subsequent LSB B′-bit adjustment.

Step 440: This is a self-timed step to fully discharge the latched HV or LV on m selected sets of 64WLs+1SSL+1GSL of m selected Blocks after Step 439 to reduce Vread WL HV-stress on the m selected Blocks. The self-timed control is done by the VLBLps Detector when it detects the selected 4 KB C_(LBL) voltage dropping below a preset value. The HXD nodes of the corresponding m selected Block decoders are set to Vdd from Vss by setting the CLWL signal to Vdd and the XD node to Vdd by each selected Flag bit. As a result, the 64WLs+1SSL+1GSL=Vss when 64XTs+1SSLp+1GSLp are set to 0V. The preferred bias conditions are listed below:

-   -   a) CLA=CLR=ENS=LBLps=0V,
    -   b) SSLp=GSLp=XT1-XT64=0V,
    -   c) TIE1˜TIEL/2=0V (L=4) and ENB=1,
    -   d) CLWL=One-shot pulse of Vdd.

Note, this Step 440 can be performed concurrently with Step 438 because there is no WL and BL bus contention issue.

Step 441: This step is to sequentially load one external Even 4 KB raw LSB page data into one on-chip 4 KB real CACHE register on a 1-byte by 1-byte basis if 8 I/Os are used in accordance with the bias conditions shown below. Note, each raw LSB bit data is not associated with a B′-bit adjustment. The raw LSB bit is just loaded from the 8 I/Os directly.

-   -   a) EQ=0V
        -   No need of pre-equalization of each real CACHE bit in advance.
    -   b) RD=LD=0V
        -   To isolate each real CACHE bit from each Multiplier, each SA and each P/RB for safer loading in accordance with the circuit shown in FIG. 3.

Note, this step can be performed simultaneously with Step 430 because there is no 64XTs+1SSLp+1GSLp and 4 KB GBL bus contention issue. The loading of 4 KB LSB data only happens between the 4 KB real CACHE and the 8 I/Os, while Vinh discharging or retaining happens in m selected local LBLs and C_(LBL) capacitors. As a consequence, the bus lines of 4 KB GBLs and 64XTs+1SSLp+1GSLp are free and available for other concurrent m-page operations with priority.

Step 442: This decision step is to check if the last byte of each 4 KB Even LSB page is loaded completely into the 4 KB real CACHE. If No, then Step 442 is looped to wait for the completion. If Yes, then the flow moves to the next step, either Step 443 or Step 444.

Step 443: This step is to inform an off-chip Flash Controller or Host, by setting RDY=Vss, that the on-chip 4 KB real CACHE register is temporarily full at this moment, with the following preferred conditions to isolate the 4 KB real CACHE from the I/Os:

-   -   a) LAT=0V
    -   b) All Ypass=0V.

As a result, the HiNAND2 array cannot take any new command, data, or address from the 8 I/Os. The busy state can be reset by setting RDY=Vdd when the current B′-adjustment step is completed or when the 4 KB Even page data in the 4 KB real CACHE register is loaded and latched in the designated pseudo CACHEs.

Step 444: Before loading and latching (LD/LT) of the 4 KB Even LSB page data stored in the 4 KB real CACHE into the on-chip 4 KB pseudo CACHE1sb first and then into the 4 KB SAs second, this decision step waits for the release of the 4 KB metal2 GBLs that might be currently occupied by one of the concurrent Block operations with the preferred biased conditions. If the 4 KB GBLs are not occupied at this moment, then the flow moves to Step 445.

Step 445: This step is to load and latch the 4 KB Even raw LSB page data in the 4 KB real CACHE to one designated 4 KB Even CACHE1sb that has been precharged with Vinh voltages in previous Step 431 of this flow. The preferred bias conditions are summarized below with reference to FIG. 1B.

-   -   a) LAT=0V, SEGe1=0V in 4 KB CACHEcel, SEGe2=0V in 4 KB CACHEint, SEGe3=1=Vdd in 4 KB CACHE1sb, SEGe4=0V in 4 KB CACHEmsb, SEGo1=SEGo2=SEGo3=SEGo4=0V,
        -   The above conditions are because only 4 KB Even CACHE1sb pseudo registers are selected to load and latch 4 KB Even LSB raw page data.
    -   b) CSL=TIE1˜TIEL/2=0V (L=4)
        -   To prevent the load and latch leakage from the selected 4 KB Even CACHE1sb to the paired Even 4 KB CACHEmsb.
    -   c) DIVen=BIAS=LD=H1
        -   This allows the full Vdd passage of 4 KB P/RB digital voltages to the designated 4 KB even CACHE1sb C_(LBL) capacitors via 4 KB bus lines.

Note, this LD/LT step also executes a Vdd/Vss-to-Vinh/Vss conversion. The Vss in each CACHE bit will pull the corresponding Vinh bit to Vss. But the Vdd in each CACHE bit will retain the corresponding Vinh bit at the Vinh voltage because SEGe=Vdd forms a diode circuit that shuts off the leakage in accordance with the circuit of FIG. 1A. Note, the 4 KB LSB data is latched in 4 KB CACHE1sb in the same polarity, but the Vdd digital bit data is replaced with HV analog bit data stored temporarily in 4 KB Even CACHE1sb.
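The Vdd/Vss-to-Vinh/Vss conversion described above can be sketched as a per-bit mapping onto precharged capacitors. This is a behavioral illustration, not a circuit model: the function name `ld_lt_convert` is hypothetical, and Vinh=7.0V is an example value consistent with the Vinh≧7V figure quoted elsewhere in the text.

```python
def ld_lt_convert(page_bits, vinh=7.0, vss=0.0):
    """Map digital CACHE bits onto precharged C_LBL capacitor voltages.

    Every capacitor starts at Vinh (precharged beforehand). A '0' bit (Vss)
    pulls its capacitor down to Vss; a '1' bit (Vdd) leaves the capacitor
    at Vinh, so polarity is preserved while the logic-high level changes.
    """
    clbl = [vinh] * len(page_bits)      # state after the Vinh precharge
    for i, bit in enumerate(page_bits):
        if bit == 0:
            clbl[i] = vss               # Vss bit discharges the capacitor
    return clbl


print(ld_lt_convert([1, 0, 1, 1, 0]))
```

The stored pattern keeps the same polarity as the loaded page data; only the logic-high level changes from the digital Vdd to the analog Vinh.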

Step 446: This step is to transfer each DIOi bit of each 4 KB real CACHE that stores each of the 4 KB Even LSB bits to each corresponding 4 KB SA via NMOS transistor 15 (with a gate tied to LD) and NMOS transistor 20 (with a gate tied to signal WRT) as shown in FIG. 3 along with the following preferred biased conditions.

-   -   a) BIAS=0V,
        -   This condition is to isolate the signal of DIOi on the common node of PBL from each corresponding GBL.
    -   b) T3=0V is to disable one input of VRef to SA and equalization.
    -   c) T4=0V,
        -   This condition is to disconnect Qi node of SA from OUTP node of Multiplier so that each DIOi bit loading into each SA would not be affected by each Multiplier.
    -   d) T5=0V,
        -   This is to disable SA during DIOi loading.
    -   e) WRT=LD=H1 One-shot pulse.

Step 447: This step is to reset RDY=1 to inform the Off-chip Flash Controller that the HiNAND2 status is now not busy and is free to take new commands and operations, because the common 64XTs+1SSLp+1GSLp bus lines, the common 4 KB GBL bus lines and the even 4 KB real CACHE registers are free and available to take any new data or concurrent operations.

Step 448: This step is to set each P/RB bit data in accordance with each corresponding LSB bit data stored in each SA with the same bit polarity with the following preferred bias conditions:

-   -   a) INV=IDAB=IDB=0V,
    -   b) ENSB1=ENSB2
        -   These conditions are to ensure Qi and QiB nodes of each SA are connected to the gates of MN18 and MN17 of two pull-down legs of SA made INV1 and INV2 of each P/RB.
    -   c) PGM=0V
        -   This step is not in Program mode.
    -   d) WBK=IDC=one shot of Vdd
        -   These two signals work as complementary input gate signals to MN16 and MN19 respectively for this operation. If Qi=1 (LSB bit=1) and QiB=0, then Di=1 and DiB=0.
    -   e) T5=1
        -   This is to enable SA.

After this step, the final bit data stored in each P/RB bit is the duplication of each Even raw LSB bit with the same polarity in both SA and P/RB.

Step 449: After loading the Even LSB raw page data into SA, P/RB and CACHE1sb, the desired B′-bit adjustment starts. In order to do so, the two lastly stored data sets, the A-state MSB data in 4 KB Even CACHEint and the B-state data in 4 KB Even CACHEcel, have to be read back into the 4 KB DB for the 4 KB Even B′-bit logic adjustment in accordance with the table of B′-bit Adjustment in Even Page shown in FIG. 7G of the present invention.

This step is to read out the 4 KB Even A-state (Vtamin-read) page analog Vinh/Vss data stored in 4 KB Even CACHEint into the 4 KB SA through an analog cell data sensing that suffers CS effect. Each final A-state bit data is fully amplified into a digital data with the same polarity stored in each corresponding SA. In other words, each Qi=0/1 corresponds to its bit data in each CACHEint=Vss/Vinh. The following are the preferred bias conditions:

-   -   a) TIE1˜TIEL/2=CSL=SEGe3 (CACHE1sb)=0V (L=4),
    -   b) DIVen=SEGe2=SEGe4 (CACHEint and CACHEmsb)=0V,
    -   c) Voutp=Vref+/−ΔV,
    -   d) T5=one shot of Vdd,
        -   This is to do the second analog amplification for the first analog amplification done by each Multiplier.

After this step, the 4 KB SAs store the 4 KB Even page data of E-state and B′-state cells under the Vtamin condition in the selected WL. In other words, the 4 KB data in the 4 KB SAs are the 4 KB Even MSB page data that are stored in the 4 KB Even CACHEint C_(LBL) capacitors by Step 437 of this flow.

Step 450: This step is to write back the 4 KB MSB Even page data at the 4 KB SAs to the 4 KB Even CACHEmsb C_(LBL) capacitors. The reason to do this step is to continue storing the 4 KB Even MSB data in 4 KB Even CACHEmsb for subsequent MLC iterative Program and Program-Verify operations. The 4 KB Even MSB page data stored in 4 KB CACHEint are corrupted after being read out to the 4 KB SAs due to the CS degradation effect between each LBL and each or up to J GBLs. This step is like Step 445, thus the details are omitted here for description simplicity. The biased conditions are summarized below.

-   -   a) SEGe (CACHEcel and CACHEint)=0V
        -   Both above two CACHEs are not selected for storing 4 KB Even MSB data.
    -   b) SEGe (Even CACHEmsb)=Vdd
        -   Because 4 KB Even CACHEmsb are selected to store 4 KB MSB Even page data.
    -   c) SEGe (Even CACHE1sb)=0V
        -   CACHE1sb is not selected for storing 4 KB Even MSB data.
    -   d) TIE1˜TIEL/2=SEGo=0V (L=4),
    -   e) DIVen=BIAS=WRT2=H1 one-shot pulse.

The above conditions are to connect the QiB node of each SA to each CACHEmsb C_(LBL) capacitor with a full Vdd passage but a reversed bit polarity.

Step 451: This step is to do bit-flipping for each B′-bit adjustment for each Even LSB 4 KB page data in 4 KB P/RB in accordance with the following bias conditions and the Table in FIG. 7G:

-   -   a) IDC=IDB=WBK=0V,
        -   IDC=0V is to turn off NMOS transistor 19, IDB=0 is to turn off NMOS transistor 8, WBK=0 is to turn off NMOS transistor 16. These combined efforts are to enable a single current path from the P/RB Di node to Vss via NMOS transistor 26 and NMOS transistor 27 only (see FIG. 3).
    -   b) ENSB1=ENSB2=IDAB=PGM=0V,
        -   To disconnect NMOS transistors 12 and 11 from SA's Qi and QiB outputs.
    -   c) INV=One shot of Vdd,
        -   This condition is to turn on NMOS transistor 26 to connect the Di current path to Vss through NMOS transistor 27 gated by Qi of SA.
    -   d) T5=1 to enable SA.

As a result of this flipping step, B1′=B2′=0=Di in each P/RB when Qi=1=MSB in each SA, as indicated by the flipping step of adjusting the LSB of B1′=B2′=0 in the table of FIG. 7G.

Step 452: Like Step 449, this step continues to read out the 4 KB Even B-state (Vtbmin-Read) page analog Vinh/Vss data stored in 4 KB Even CACHEcel into the 4 KB SA through an analog cell data sensing that suffers CS effect. Each final B-state bit data is fully amplified into a digital data with the same polarity stored in each corresponding SA. In other words, each Qi=0/1 corresponds to its bit data in each CACHEcel=Vss/Vinh. The following are the preferred biased conditions.

-   -   a) TIE1˜TIEL/2=CSL=0V (L=4),
    -   b) SEGe3 (CACHE1sb)=SEGe4 (CACHEmsb)=0V,
        -   to isolate above two CACHEs from 4 KB GBLs.
    -   c) DIVen=SEGe1 (CACHEcel)=H1,
        -   to connect 4 KB Even CACHEcel to 4 KB GBLs.
    -   d) Voutp=Vref+/−ΔV,
    -   e) T5=one-shot pulse of Vss.

After this step, the 4 KB SAs store the 4 KB Even B-state page data of 8 KB cells under the Vtbmin condition in the selected WL.

Step 453: This step is to use the newly read out 4 KB Even B-state page data in the 4 KB SAs to set the last B2′-bit data with reversed bit polarity. For example, Di=0 in each P/RB bit is flipped to “1” if each corresponding Qi=1 in each SA bit in accordance with the following bias conditions:

-   -   a) IDAB=IDB=INV=0V,
    -   b) ENSB1=ENSB2=WBK=PGM=0V,
    -   c) IDC=one shot of Vdd and T5=1.

After this step, the final LSB B′-adjustment bit data in P/RB is E=1, A=0, B1′=0, B2′=1, C=0, where “0” means Program but “1” means Program-Inhibit.

Step 454: This is the self-timed step to precharge both 4 KB Even CACHEcel and 4 KB Even CACHEint to prepare for loading and latching the 4 KB Even LSB page data, after the final B′-bit adjustment, back to both 4 KB Even CACHEcel and 4 KB Even CACHEint concurrently. This cleans up the 4 KB P/RB for the B′ adjustment of the next 4 KB Odd LSB page at the 4 KB P/RB and 4 KB SA in accordance with the following conditions:

-   -   a) CSL=DIVen=TIE1˜TIEL/2=0V (L=4),
    -   b) PREo=SEGe=SEGo=0V,
    -   c) PREe=H1, LBLps=Vinh.

Step 455: This step is to do loading and latching of the 4 KB Even LSB page data after final B′-bit adjustment from the 4 KB P/RB to both 4 KB Even CACHEcel and 4 KB Even CACHEint concurrently in 1-cycle in accordance with the following bias conditions:

-   -   a) SEGe1 (CACHEcel)=Vdd,
    -   b) SEGe2 (CACHEint)=Vdd,
    -   c) SEGe3 (CACHE1sb)=0V,
    -   d) SEGe4 (CACHEmsb)=0V,
    -   e) DIVen=BIAS=PGM=H1 one shot,
        -   Connect a path from each bit of P/RB to CACHEcel and CACHEint.

FIG. 7F is a flow chart showing a method for performing m-page MLC (LSB Odd page) Data Loading and B′ Adjustment according to an embodiment of the present invention. This method follows Step 447 to sequentially load one external 4 KB raw LSB Odd page data into the on-chip 4 KB real CACHE registers on a byte-by-byte basis in accordance with the bias conditions shown below. Note, each raw LSB Odd page data is loaded into the 4 KB real CACHE register from the I/Os directly.

-   -   a) Y-pass=1=Vdd
    -   b) RD=LD=EQ=0V

To isolate each real CACHE bit from each Multiplier, each SA, and each P/RB for safe loading in accordance with the circuit shown in FIG. 3.

Step 461: It is like Step 442. This decision step is to check if the last byte of each 4 KB Odd LSB page is loaded completely into the 4 KB real CACHE. If No, then Step 461 is looped to wait for the completion. If Yes, then the flow moves to the next step, either Step 462 or Step 463.

Step 463: It is like Step 443. This step is to inform an off-chip Flash Controller or Host, by setting a RDY signal to Vss, that the on-chip 4 KB real CACHE register is temporarily full at this moment, with the following preferred conditions to isolate the 4 KB real CACHE from the I/Os, with details being omitted here.

-   -   a) LAT=0V
    -   b) All Ypass=0V.

Step 462: It is like Step 444. Before loading and latching of the 4 KB Odd LSB page data stored in the 4 KB real CACHE into the on-chip 4 KB pseudo CACHE1sb first and then into the 4 KB SAs second, this decision step waits for the release of the 4 KB metal2 GBL bus lines that might be currently occupied by one of the concurrent m-page operations with the preferred biased conditions. If the 4 KB GBL bus lines are not occupied at this moment, then the flow moves to Step 464.

Step 464: This step is to load and latch the 4 KB raw Odd LSB page data in the 4 KB real CACHE to one designated 4 KB Odd pseudo CACHE1sb that has been precharged with Vinh in previous Step 431 of this flow. The preferred bias conditions are summarized below with reference to the HiNAND2 array in FIG. 1A.

-   -   a) LAT=0V, SEGo1=0V in 4 KB CACHEcel, SEGo2=0V in 4 KB CACHEint, SEGo3=1=Vdd in 4 KB CACHE1sb, SEGo4=0V in 4 KB CACHEmsb, SEGe1=SEGe2=SEGe3=SEGe4=0V,
        -   The above conditions are because only 4 KB Odd CACHE1sb pseudo registers are selected to Ld/Lt 4 KB raw Odd LSB page data.
    -   b) CSL=TIE1=TIE2=0V,
        -   To prevent the Ld/Lt leakage from the selected 4 KB Odd CACHE1sb to the paired Odd 4 KB CACHEmsb.
    -   c) DIVen=BIAS=LD=H1.

Note, the 4 KB Odd LSB data is latched in 4 KB Odd CACHE1sb in the same polarity, but the Vdd digital bit data is replaced with HV analog bit data stored temporarily in 4 KB Odd CACHE1sb.

Step 465: It is like Step 446. This step is to transfer each DIOi bit of each 4 KB real CACHE that stores each of the 4 KB Odd LSB bits to each corresponding 4 KB SA with the following preferred bias conditions:

-   -   a) BIAS=0V,
        -   This condition is to isolate the signal of DIOi on the common node of PRL from each corresponding GBL.
    -   b) T3=0V is to disable one input of Ref to SA and equalization.
    -   c) T4=0V,
    -   d) T5=0V.

Step 466: It is like Step 447. This step is to reset RDY=1 to inform the Off-chip Flash Controller that the HiNAND2 array status is now temporarily not busy and is free to take new commands and operations as in Step 446.

Step 467: It is like Step 448. This step is to set each P/RB bit data in accordance with each corresponding LSB bit data stored in each SA with the same bit polarity with the same preferred bias conditions. Thus the details can refer to Step 448. After this step, the final bit data stored in each P/RB bit is the duplication of each Odd raw LSB bit with the same polarity in both SA and P/RB.

Step 468: It is like Step 449. After loading the Odd LSB raw page data into SA, P/RB and CACHE1sb, the desired B′-bit adjustment is started. The details can refer to Step 449. After this step, the 4 KB SAs store the 4 KB Odd page data of E-state and B′-state cells under the Vtamin condition in the selected WL.

Step 469: It is like Step 450. This step is to write back the 4 KB MSB Odd page data at the 4 KB SAs to the 4 KB Odd CACHEmsb C_(LBL) capacitors. The details can refer to Step 450.

The above conditions are to connect the QiB node of each SA to each Odd CACHEmsb C_(LBL) capacitor with a full Vdd passage but a reversed bit polarity.

Step 471: It is like Step 451. This step is to do bit-flipping for each B′-bit adjustment for each Odd LSB 4 KB page data in 4 KB P/RB in accordance with what is displayed in the Table of FIG. 7G. The details can refer to Step 451.

Step 472: It is like Step 452. As in Step 449, this step continues to read out the 4 KB Odd B-state (Vtbmin-Read) page analog Vinh/Vss data stored in 4 KB Odd CACHEcel into the 4 KB SA through an analog cell data sensing that suffers CS effect. Each final B-state bit data is fully amplified into a digital data with the same polarity stored in each corresponding SA. The details can refer to Step 452 and are omitted here. After this step, the 4 KB SAs store the 4 KB Odd B-state page data of 8 KB cells under the Vtbmin condition in the selected WL.

Step 473: It is like Step 453. This step is to use the newly read out 4 KB Odd B-state page data in the 4 KB SAs to set the last B2′-bit data with reversed bit polarity. The details can refer to Step 453. After this step, the final Odd LSB B′-adjustment bit data in P/RB is E=1, A=0, B1′=0, B2′=1, C=0, where “0” means Program but “1” means Program-Inhibit.

Step 474: It is like Step 454. This is the self-timed step to precharge both 8 KB CACHEcel and CACHEint to prepare for loading and latching the 4 KB Odd LSB page data, after the final B′-bit adjustment, back to both 4 KB Odd CACHEcel and 4 KB Odd CACHEint concurrently, which cleans up the 4 KB P/RB for the next operation. The details can refer to Step 454.

Step 475: It is like Step 455. This step is to do loading and latching of the 4 KB Odd LSB page data after final B′-bit adjustment from the 4 KB P/RB to both 4 KB Odd CACHEcel and 4 KB Odd CACHEint in 1-cycle. The details can refer to Step 455.

Step 477: This is the decision step to check if the last LSB page is programmed. If Yes, then the flow moves to Step 479 to receive the Program confirmation code and then moves to Step 480. If No, then the flow moves to Step 478.

Step 478: Host can issue a new command, a next LSB page address, and data into the 4 KB real CACHE for the next B′-bit adjustment.

Step 480: It is like Step 434 but with the different, higher voltages of Vpgm and Vpass. This is a self-timed step to set up (or charge up) several preferred voltages of each iterative Program operation for m randomly selected sets of 64WLs+1SSL+1GSL lines like Step 434. The desired WL voltages are summarized below:

-   -   a) One selected WL=Vpgm (15V to 25V)
        -   An ISSP scheme will be used with 0.2V-0.4V per ΔVpgm starting from 15V.
    -   b) 63 unselected WL=Vpass (8-10V)
        -   Vpass has to be high enough to turn on the cells of the 63 unselected WLs even if they store the highest C-state.
    -   c) SSL=H1≧Vin+Vt
    -   d) GSL=0V to prevent Program-inhibit Vinh leakage in CACHEcel through NAND string.
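The ISSP voltage ladder quoted above (Vpgm from 15V to 25V in ΔVpgm steps of 0.2V to 0.4V) can be enumerated with a short sketch. The function name `issp_ladder` is hypothetical, and the assumption that every step up to 25V is used is for illustration only; in practice the iteration would stop once Program-Verify passes. Integer millivolts are used to avoid floating-point rounding in the step arithmetic.

```python
def issp_ladder(start_mv=15000, stop_mv=25000, step_mv=200):
    """Return the Vpgm pulse levels in millivolts, both endpoints included."""
    if step_mv <= 0:
        raise ValueError("step_mv must be positive")
    return list(range(start_mv, stop_mv + 1, step_mv))


fine = issp_ladder(step_mv=200)    # 0.2V per step
coarse = issp_ladder(step_mv=400)  # 0.4V per step
print(len(fine), fine[0] / 1000, fine[-1] / 1000)
print(len(coarse), coarse[0] / 1000, coarse[-1] / 1000)
```

Under these assumptions the full range corresponds to at most 51 pulses at 0.2V per step or 26 pulses at 0.4V per step, which bounds the number of Program and Program-Verify iterations for one page.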

The voltage setup of each set of 64WLs+1SSL+1GSL needs to set up the desired voltages of each set of 64XTs+1SSLp+1GSLp lines first with the same voltages but also set HXD≧Vpgm+Vt to make 64WLs+1SSL+1GSL=64XTs+1SSLp+1GSLp correspondingly.

Since the WL Vpgm voltage of the LSB Program operation is higher than the Vread of B′-adjustment, the Vpgm Detector is used to replace the Vread Detector for the MLC (MSB or LSB) or SLC Program operation. Except for the voltage difference, the other steps of WL setup are similar. Thus, the details of Program WL setup can refer to Step 434.

Step 481: This is the self-timed step to latch the precharged 64WLs+1SSL+1GSL in their respective Poly parasitic capacitors. This is done automatically by using the Vpgm Detector at one end of a dummy WL as explained previously. Once Vpgm is detected, the Program operation is started and the self-timed Program period is counted. The program time is also controlled by the LBLps Detector as explained previously. The flow then moves to Step 482.

Step 482: The HiNAND2 array also uses the FN-tunneling scheme for the m-page Program operation. It is preferred to have an All-BL Program, i.e., 8 KB NAND cells in one physical WL are selected for concurrent Program operations. For each ISSP, each iterative program time is about 10-20 μs. The program time is controlled by another self-timed control circuit, which can be made in varied ways.

Step 483: This step is to discharge HV on all selected sets of 64WLs+1SSL+1GSL to Vss by setting the 64XTs+1SSLp+1GSLp lines to Vss and setting the HXD node to Vdd in accordance with the following preferred bias conditions.

-   -   a) CLA=CLR=ENS=0V,
    -   b) ENB=1=Vdd,
    -   c) SSLp=GSLp=XT1˜XT64=0V,
    -   d) CLWL=One-shot pulse of Vdd.

Next the flow moves to Step 496, which is a first step that is similar to the Vinh precharge step, for the next Block A-state PGM-VFY operation shown in FIG. 7H with explanations to be seen below.

FIG. 7G is a Table listing operation conditions for B′-adjustment in Even Page according to an embodiment of the present invention. The table shows the detailed operation sequences of bit flipping sub-steps during this preferred B′-bit adjustment of 4 KB Even LSB Page data. The same Table of operation conditions can also be used as a guideline for B′-bit adjustment of 4 KB Odd LSB Page data.

As seen in FIG. 7G, the bit-flipping of LSB B′-adjustment is preferably performed between the 4 KB SAs and the corresponding 4 KB P/RBs with same or different sizes in accordance with the circuit of FIG. 3. But in certain embodiments of the present invention, the same size of 4 KB is used for both SAs and P/RBs for simpler illustration purposes without limiting the scope of the claims herein. The following terminology explanations are used in the Table shown in FIG. 7G.

-   -   I. @SA: At each SA.
        -   It indicates what is each MLC cell's interim and final LSB logic data for respective states of E, A, B1′, B2′ and C stored at each SA.
    -   II. @P/RB: At each P/RB.
        -   It indicates what is each MLC cell's interim and final LSB logic data for respective states of E, A, B1′, B2′ and C stored at each P/RB.
    -   III. “X”: It means “Don't-care” initial state that can store either “1” or “0”.
    -   IV. The Vt distribution of four final states of E, A, B and C or 5 interim states of E, A, B1′, B2′ and C are defined in FIG. 4.
    -   V. Multiplier, SA and P/RB circuits are shown in FIG. 3.
    -   VI. The HiNAND2 array circuit is FIG. 1A.
    -   VII. Odd LSB and Even LSB share the same B′-adjustment scheme.
    -   VIII. The 4 KB raw LSB page data loaded from I/Os is stored in 4 KB P/RB initially with Bit assignment before B′-adjustment as shown below.
        -   E=1, A=0, B1′=1, B2′=1, and C=0
    -   IX. 4 KB MSB data of E=0, A=0, B1′=1, B2′=1, and C=1 under the Read condition of Vtbmin=Selected WL.
    -   X. Later, 4 KB MSB data is logically reversed in SA's Qi and QiB outputs so that E=1, A=1, B1′=0, B2′=0, and C=0. This step is to selectively flip B1′=B2′=1 in each P/RB to B1′=B2′=0 when SA's Qi=0.

Load LSB state from I/O to CR (Step 441): This is a first step to sequentially load 4 KB Even LSB raw page data in 4K cycles from 8 I/Os to the on-chip 4 KB real CACHE. The SA and P/RB are not loaded yet, thus the data stored in both SA and P/RB are at X-state.
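The "4K cycles" figure for the sequential load follows from one byte per cycle over 8 I/Os, in contrast to the 1-cycle parallel transfers between on-chip registers. A minimal sketch of that arithmetic follows; the function name `load_cycles` is hypothetical, and 4 KB is assumed to mean 4096 bytes.

```python
def load_cycles(page_bytes=4096, io_width_bits=8):
    """Cycles needed to shift one page through the I/O pins (ceiling division)."""
    total_bits = page_bytes * 8
    return -(-total_bits // io_width_bits)


serial_cycles = load_cycles()                     # byte-wide load over 8 I/Os
wider_cycles = load_cycles(io_width_bits=16)      # hypothetical 16-I/O variant
print(serial_cycles, wider_cycles)
```

With 8 I/Os the load takes 4096 cycles, which is why the flow overlaps this step with the array-side Vinh discharging rather than serializing the two.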

Load and latch LSB state from CR to CACHE1sb (Step 445): This is a second step to load in parallel in 1-cycle from the 4 KB real CACHE to the on-chip 4 KB pseudo Even CACHE1sb C_(LBL) capacitors with the same polarity. Later, the same 4 KB Even raw LSB page data is loaded in parallel in 1-cycle from the 4 KB real CACHE into the 4 KB SAs with the same polarity. Thus, @SA=10110=E,A,B1′,B2′,C. At this step, 4 KB P/RB is not loaded yet. As a result, @P/RB=X.

Load LSB state from CR to P/RB via SA (Steps: 446˜448): This is a step to copy the 4 KB SAs' data into 4 KB P/RB directly with the same polarity. As a result, @4 KB SA=@4 KB P/RB=10110=E,A,B1′,B2′,C.

Read MSB state from cell into 4 KB SAs (Steps: 434˜435 & 449): This step is like an MSB Read with the selected WL=Vtb′min. As a result, @4 KB SA=00111=E,A,B1′,B2′,C, because Vtb′min>Vtamax; but @4 KB P/RB=10110=E,A,B1′,B2′,C because it is not affected by the MSB Read.

Ld/Lt LSB MSB from SA to CACHEmsb (Step: 450): This step loads and latches the Even 4 KB MSB page data back to 4 KB Even CACHEmsb with the same polarity but does not affect 4 KB P/RB. As a result, @SA=00111=E,A,B1′,B2′,C, but @4 KB P/RB=10110=E,A,B1′,B2′,C.

Adjust LSB of B1′/B2′ to 0 (Step 451): This step starts adjusting the B1′ and B2′ bit data in accordance with the 4 Vt assignment in FIG. 4 and the SA and P/RB circuits shown in FIG. 3, with MSB data stored in SA and LSB raw data stored in P/RB. As a result, after the bit flipping logic operation, the P/RB bit data are changed to a new value of @4 KB P/RB=10000=E,A,B1′,B2′,C.

Final LSB adjusted data Ld/Lt to CACHEcel and CACHEint (Steps: 452˜455): Finally, the desired 4 KB Even LSB B′-adjustment page data of @4 KB P/RB=10010=E,A,B1′,B2′,C is obtained to replace the @P/RB=10110=E,A,B1′,B2′,C of the 4 KB raw LSB Even data. Note, the steps of 4 KB Odd LSB B′-adjustment page data are identical to the 4 KB Even B′-adjustment steps and are omitted hereby for description simplicity.
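The sequence of @P/RB values quoted above (10110, then 10000, then 10010) can be traced with a small bit-level sketch. It models only the logical effect of the two flipping steps, not the transistor paths of FIG. 3. The function names are hypothetical. The first SA pattern (00111) is the one quoted in the text; the second SA pattern (00010) is not quoted in the text and is an assumption chosen so that the result matches the stated final value.

```python
STATES = ("E", "A", "B1'", "B2'", "C")


def clear_where_sa_set(prb, sa):
    """Step 451 style flip: Di is pulled to 0 wherever Qi = 1."""
    return [0 if q == 1 else d for d, q in zip(prb, sa)]


def set_where_sa_set(prb, sa):
    """Step 453 style flip: Di is set to 1 wherever Qi = 1."""
    return [1 if q == 1 else d for d, q in zip(prb, sa)]


raw_lsb = [1, 0, 1, 1, 0]     # @P/RB after loading raw LSB data (from the text)
sa_first = [0, 0, 1, 1, 1]    # @SA after the MSB read (from the text)
sa_second = [0, 0, 0, 1, 0]   # ASSUMPTION: SA pattern after the Vtbmin read

after_451 = clear_where_sa_set(raw_lsb, sa_first)
final_lsb = set_where_sa_set(after_451, sa_second)

print(dict(zip(STATES, after_451)))
print(dict(zip(STATES, final_lsb)))
```

Under these assumptions the intermediate value is 10000 and the final value is 10010, i.e. E=1, A=0, B1′=0, B2′=1, C=0, where "1" marks Program-Inhibit.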

Now, the following FIGS. 7H, 7I, 7J, 7K, 7L, and 7M show three Program-Verify (PGM-VFY) flows, namely A-state PGM-VFY, B-state PGM-VFY and C-state PGM-VFY. All three PGM-VFY flows are performed only after the completion of m pages of 8 KB MLC LSB page Program at Step 483 and m pages of 8 KB MLC MSB page Program at Step 371 previously.

Three PGM-VFY operations are preferably performed in order, starting from the lowest Vt of A-state, then B-state, and lastly C-state, to save power consumption and to reduce the overall PGM-VFY latency. Because the GBLs, DB and real CACHE are preferably reduced to 4 KB, half the size of the 8 KB LBLs or 8 KB cells in one physical WL, each full-page PGM-VFY step of A, B, and C states is divided into 2 half-pages: a 4 KB Even PGM-VFY ½-page and a 4 KB Odd PGM-VFY ½-page.
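The resulting verify order can be sketched as a simple schedule: three states in ascending Vt order, each split into two 4 KB half-pages. The function name `pgm_vfy_schedule` is hypothetical, and running the Even half before the Odd half within each state is an assumption made for illustration.

```python
from itertools import product


def pgm_vfy_schedule(states=("A", "B", "C"), halves=("Even", "Odd")):
    """Enumerate half-page PGM-VFY passes, lowest-Vt state first."""
    return [(state, half) for state, half in product(states, halves)]


schedule = pgm_vfy_schedule()
for state, half in schedule:
    print(f"{state}-state PGM-VFY, 4 KB {half} half-page")
```

One full 8 KB physical WL therefore needs six half-page verify passes per Program iteration under this scheme.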

FIG. 7H is a flow chart showing a method for performing m-page MLC (LSB page) A-state Program-Verify operation according to an embodiment of the present invention. The method includes sequential and looped steps from Step 496 to Step 516.

Step 496: Concurrent 8 KB LBL Vinh precharging. This step continues from Step 483 and is a self-timed step used to perform 1-cycle 8 KB concurrent LBL Vinh precharging on both 4 KB Even and 4 KB Odd C_(LBL) capacitors in one selected CACHEcel. The self-timed control is achieved by a LBLps Differential Amplifier (DA) shown in FIG. 9C with detailed explanation in later sections of the specification.

As seen in the HiNAND2 array of FIG. 1A, each Segment incorporates one LBLps-DA, which is preferably enabled by the Segment Flag latch when it is selected for precharging so that the LBLps-DA power consumption can be cut off when it is not selected. During the precharging period, each LBLps power line is charged up from Vss to Vinh by a Vinh-Driver (not shown). One end of the LBLps line is connected to one input of the LBLps-DA. The other input of the LBLps-DA is connected to at least one reference Vref, such as Vinh in each LBLps precharging cycle and 1.0V in each LBLps discharging cycle.

The LBLps precharge cycle is initiated automatically by a preferred Vread-DA, which detects a voltage of a Dummy Vread-WL as shown in FIG. 9B. During the LBLo[N] and LBLe[N] precharging step, Vread-DA's anode node of Vref=Vread. When each selected LBLps voltage reaches Vinh, the EN output of Vread-DA switches from Vss to Vdd to lower the voltage of PREe and PREo to 1V+Vt≈1.6V from the voltage≧Vinh+Vt used during the precharging cycle. The Vinh voltage at each precharging LBLps line of the selected Segment is retained as the initial HV for the next LBLps discharging cycle to save power consumption. The precharging time is around 1-2 μs, depending on the total 8 KB or 4 KB C_(LBL) capacitance (C), the resistance (R) of each MLBLse or MLBLso, and the strength of the LBLps Driver. The preferred bias conditions are summarized below:

-   -   a) PREe1=PREo1=H1 and LBLps1=Vinh for precharging both 4 KB Even CACHEcel and 4 KB Odd C_(LBL) capacitors in 1-cycle.
    -   b) TIE1=0V to prevent leakage from CACHEcel to other CACHEint.
    -   c) DIVen=SEGo1=SEGe1=0V for preventing the leakage of 4 KB CLBL1o and 4 KB CLBL1e capacitors in CACHEcel to the common 4 KB GBLs,
    -   d) TIEL/2=CSL=0V (L=4) because no use.

Step 497: XT bus checking. This decision step checks if the common bus lines of 64XTs+1SSLp+1GSLp are free, i.e., not occupied by other existing concurrent operations, after the Vinh-precharging step. If No, then the flow is looped to wait until the XT bus lines are free and available. If Yes, then the flow moves to Step 498 to charge m selected sets of 64WLs+1SSL+1GSL poly lines.

Step 498: Self-timed Charging & Latching of 64WLs+1SSL+1GSL with Selected WL=Vtamin condition. For performing this preferred m-page MLC (LSB page) A-state Program-Verify operation, the m selected 64WLs+1SSL+1GSL poly lines have to be charged with the predetermined voltages. As explained before, the desired voltages of each set of 64WLs+1SSL+1GSL are coupled from one common set of 64XTs+1SSLp+1GSLp with the similarly desired voltages as indicated in Table 4 below.

TABLE 4 one set of 64WLs + 1SSL + 1GSL voltages for m-page concurrentMLC (LSB) A-state PGM-VFY setting (HXD = Vread + Vt) and latching (HXD =0 V) 1 WL(sel) = Vtamin 1 XT(sel) = Vtamin 63 WLs(un-sel) = Vread 63XTs(un-sel) = Vread 1 SSL = Vdd 1 SSLp = Vdd 1 GSL = Vread 1 GSLp =Vread HXD with matched Pi, Qj, Sk Vread + Vt when setting but to enablelocal pump and latch Vss when latching
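The bias set of Table 4 can also be written as a small data structure. The following Python sketch is illustrative only: the function name, the dictionary keys, and the use of symbolic voltage names instead of numeric values are assumptions made for the example.

```python
def verify_bias(state, n_wl=64, selected=0):
    """Return symbolic target voltages for one set of 64WLs+1SSL+1GSL.

    `state` selects the verify level applied to the selected WL.
    All names here are hypothetical labels for illustration.
    """
    level = {"A": "Vtamin", "B": "Vtbmin", "C": "Vtcmin"}[state]
    wl = ["Vread"] * n_wl          # unselected WLs pass at Vread
    wl[selected] = level           # selected WL sits at the verify level
    return {
        "WL": wl,
        "SSL": "Vdd",
        "GSL": "Vread",
        "HXD_setting": "Vread+Vt",  # enables local pump and latch
        "HXD_latching": "Vss",
    }

bias = verify_bias("A", selected=5)
print(bias["WL"][5], bias["SSL"], bias["GSL"])
```

Only the selected-WL entry changes between the A-state, B-state and C-state verify settings; the remaining entries stay the same.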

Again, this is another self-timed WL Vread charging step and is controlled by one Vread-DA, which is preferably independent from the Vpgm-DA and LBLps-DA because m-page Program and Program-Verify operations in different Segments of the HiNAND2 array can happen at the same time. Consequently, the Vread-DA and Vpgm-DA are preferably not combined into one DA. The Vread-DA circuit shown in FIG. 9B is similar to the Vpgm-DA, the only difference being the Vref voltage. During this PGM-VFY step, one reference input of the Vread-DA is connected to Vread; when the same circuit is used for detecting Vread discharging, Vref switches from Vread to 1.0V.

Note that the dummy WL for Vread discharging does not need to be set to Vss because, for normal DA operation at Vdd=1.8V, a 1.0V level still keeps the DA in proper operation, whereas Vss=0V would disable the DA and cannot be used. Moreover, by the time the C_(LBL) capacitor has discharged from Vinh (≧7V) to 1V, about 90% of the discharge has been completed. The resulting ΔV=6V provides a large margin for Multiplier operation. In other words, once 1V is detected on C_(LBL), the final C_(LBL) voltage will be below 1V for E-cells and retained at Vinh for A-cells, B-cells, and C-cells.
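As a worked check of these figures, taking Vinh=7V and Vref=1V, the fraction of the voltage swing already covered at the detection point is given below. The second expression additionally assumes a simple exponential RC discharge, which is an illustrative model and not a statement about the actual cell current.

```latex
\frac{V_{inh}-V_{ref}}{V_{inh}} = \frac{7\,\mathrm{V}-1\,\mathrm{V}}{7\,\mathrm{V}} \approx 0.86,
\qquad
t_{1\mathrm{V}} = RC\,\ln\frac{V_{inh}}{V_{ref}} = RC\,\ln 7 \approx 1.95\,RC
```

The first value is broadly in line with the "about 90%" stated above, and a higher Vinh moves it closer to 90%.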

Step 499: Self-timed C_(LBL) Vinh discharging or retaining step. Like Step 435, this is a self-timed operation that performs LBL Vinh discharging or retaining by setting Vtamin on each selected WL, in accordance with the stored states of the m 8 KB MLC cells in the m selected 8 KB CACHEcel. Again, this is like an ABL (All-BL) Program-Verify scheme for both the 4 KB Even and 4 KB Odd LSB pages per one 8 KB physical MLC-WL.

The final value of each LSB C_(LBL) bit voltage is determined by the state of each MLC cell in the corresponding CACHEcel. If an MLC cell is in the E-state, the corresponding LSB C_(LBL)'s Vinh voltage is discharged to Vss; otherwise the C_(LBL) voltage retains its initial precharged Vinh voltage. That is, E-cell=0V in CACHEcel because Vtamin>Vtemax≧−0.5V and TIE1=0V, and A-cell=B-cell=C-cell=Vinh in CACHEcel because Vtamin<Vta<Vtb<Vtc. The preferred bias conditions are listed below:

-   a) CSL=PREe1=PREo1=LBLps1=0V
    -   All 4 KB PREe1=4 KB PREe2=4 KB PREo1=4 KB PREo2=Vss. This is to prevent C_(LBL) leakage of both the 4 KB Even and 4 KB Odd CACHEcel to one common LBLps1 line.
-   b) SEGo1=SEGe1=0V
    -   This is to prevent metal1 C_(LBL) leakage of both CACHEcel to the common metal2 4 KB GBLs.
-   c) DIVen=CSL=0V,
-   d) TIE1=0V

Only the E-cells in CACHEcel discharge Vinh to below 1.0V or to Vss; the other cells retain Vinh. Note that the C_(LBL) discharge is automatically detected by the LBLps-DA with its Vref=1.0V.
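The discharge-or-retain outcome can be summarized in a short Python sketch. It assumes the usual ordering of the four MLC states (E below A below B below C) and treats the result as a symbolic level. The function name and the generalization to B-state and C-state verify levels are assumptions added for illustration.

```python
STATE_ORDER = ["E", "A", "B", "C"]  # increasing threshold voltage

def clbl_after_verify(cell_state, verify_state):
    """Return the C_(LBL) level after the self-timed verify step.

    A cell whose state lies below the verify level conducts and
    discharges its C_(LBL) to 'Vss'; any other cell stays off and
    the precharged 'Vinh' is retained.
    """
    if verify_state not in ("A", "B", "C"):
        raise ValueError("verify_state must be 'A', 'B' or 'C'")
    conducts = STATE_ORDER.index(cell_state) < STATE_ORDER.index(verify_state)
    return "Vss" if conducts else "Vinh"

# A-state verify (selected WL = Vtamin): only E-cells discharge.
print([clbl_after_verify(s, "A") for s in STATE_ORDER])
```

For the A-state verify of this step, the sketch returns Vss for E-cells and Vinh for A-cells, B-cells and C-cells, matching the conditions listed above.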

Step 500A: As explained above, the voltages of each set of 64WLs+1SSL+1GSL are preferably latched on the respective WL poly parasitic capacitors at the same time as the corresponding 8 KB C_(LBL) capacitors perform the discharging and retaining step of Step 499. The reason for keeping the voltages on each set of 64WLs+1SSL+1GSL is to save power for the next B-state or C-state Program-Verify operations performed on the same set of 64WLs+1SSL+1GSL, without recharging. The bias conditions of WL latching are summarized below in accordance with the Segment circuit shown in FIG. 2A.

-   a) CLWL=CLA=0,
-   b) CLR=ENS=0,
-   c) ENB=One-shot pulse of Vss.

Step 501: This decision step checks if the selected Odd Segment's flag latch was reset in the last iterative PGM-VFY step. If Yes, the flow moves to Step 505 in FIG. 7I to perform the Even Page PGM-VFY. If No, the flow moves to Step 502 to check if the Even Segment's flag latch was reset in the last iterative PGM-VFY step. If Yes, the flow moves to Step 520 to do the Odd-page PGM-VFY in FIG. 7I. If No, the flow moves to Step 513 to perform the next B-state PGM-VFY in FIG. 7J below. The reason for checking the Segment latches is to save PGM-VFY steps. The details are explained below.
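The branching of Steps 501 and 502 can be sketched as a small Python function. The function and argument names are hypothetical; the returned numbers are simply the step numbers named in the text above.

```python
def next_step_after_latching(odd_flag_reset, even_flag_reset):
    """Choose the next flow step from the Segment flag latches.

    Mirrors the Step 501 / Step 502 decisions as described in the text.
    """
    if odd_flag_reset:       # Step 501 answers Yes
        return 505           # Even-page PGM-VFY sub-flow
    if even_flag_reset:      # Step 502 answers Yes
        return 520           # Odd-page PGM-VFY sub-flow
    return 513               # neither latch reset

print(next_step_after_latching(False, True))
```

Skipping the sub-flow of a page whose Segment latch was already reset is what saves PGM-VFY steps.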

Step 513: This step includes 8 steps of LSB Even page A-state PGM-VFY operations as shown in FIG. 7I below. After this step, the following data are stored in the respective circuits:

-   a) 4 KB Even A-state LSB page data are stored in 4 KB P/RB,
-   b) 4 KB Even MSB page data are stored in 4 KB CAP2,
-   c) 4 KB Even C-state LSB page data are stored in 4 KB SA,
-   d) 4 KB Even MSB page data are loaded and latched back to 4 KB Even CACHEmsb.

Step 514: This is a self-timed Vinh precharge operation on the 4 KB Even CACHEint with the following bias conditions:

-   a) PREe2=H1 and LBLps=Vinh,
-   b) PREo2=CSL=TIE1˜TIEL/2=0V (L=4),
-   c) DIVen=SEGo2=SEGe2=0V,
-   d) Others=0V.

Step 515: This is a step to load and latch the 4 KB P/RB's updated LSB A-state PGM-VFY page data to the 4 KB Even CACHEint with the following bias conditions:

-   a) SEGe3=SEGe4=0V (for CACHE1sb and CACHEmsb)
-   b) TIE1˜TIEL/2=SEGo=0V (L=4),
-   c) SEGe2=Vdd=1,
-   d) SEGe1=0V (for CACHEcel),
-   e) DIVen=BIAS=PGM=H1.

After this step, the 4 KB Even LSB A-state PGM-VFY operation is completed, and the flow moves to Step 516 to repeat the same steps for the 4 KB Odd LSB A-state PGM-VFY, which comprises sub-steps 520 to 527 with details shown in FIG. 7I below. When the 4 KB LSB Odd page PGM-VFY operation is completed, the flow moves to Step 530.

FIG. 7I is a flow chart showing a method for performing m-page MLC (LSB page) A-state Program-Verify (PGM-VFY) operation according to an embodiment of the present invention. This flow is divided into two sub-flows: I) the 4 KB Even LSB Page A-state PGM-VFY flow from Step 505 to Step 512 and II) the 4 KB Odd LSB Page A-state PGM-VFY flow from Step 520 to Step 527.

Step 505: This step is to sense and amplify the last (old) iterative 4 KB A-state analog Even page data, obtained under the Vtamin condition and stored in Even CACHEint, to the 4 KB SA via the 4 KB metal2 GBLs as explained before with the following conditions:

-   a) DIVen=SEGe2=H1 (in CACHEint),
-   b) SEGe1=SEGe3=SEGe4=0V (in CACHEcel, CACHE1sb, and CACHEmsb),
-   c) CSL=0V,
-   d) TIE1˜TIEL/2=SEGo=0V (L=4),
-   e) SEGo2=0V (for CACHEint),
-   f) DIVen=BIAS=PGM=H1.

Step 506: This step is to transfer the 4 KB SA Even LSB A-state PGM-VFY data to the 4 KB P/RB with the same polarity with the following conditions:

-   a) INV=IDB=IDAB=0V,
-   b) ENMLSB=PGM=0V,
-   c) IDC=WBK=One-shot pulse of Vdd,
-   d) T5=1.

Step 507: This step is to sense and amplify the stored 4 KB MSB analog page data in Even CACHEmsb to the 4 KB SA via the 4 KB metal2 GBLs as explained before with the following conditions:

-   a) TIE1˜TIEL/2=0V (L=4),
-   b) DIVen=SEGe4=H1 (in CACHEmsb),
-   c) SEGe1=SEGe2=SEGe3=0V (in CACHEcel, CACHEint, and CACHE1sb),
-   d) CSL=0V,
-   e) SEGo4=0V,
-   f) T5=1, Voutp=Vref+/−ΔV.

Step 508: This step is to transfer the 4 KB SA Even LSB B-state MSB data to the 4 KB CAP2 with the same polarity. The values of the 4 KB MLSB=1/0 in the 4 KB CAP2 if Qi=1/0 in the 4 KB SAs. The preferred bias conditions are listed below:

-   a) INV=ENSB2=0V,
-   b) IDAB=0V,
-   c) PGM=IDB=IDC=WBK=0V,
-   d) ENSB1=One-shot H1,
    -   This condition connects and latches the SA's Qi to CAP2 (see FIG. 3) when the NMOS transistor 12 is fully turned on by ENSB1=H1. The MSB bit data is stored in CAP2 temporarily.
-   e) T5=1.

Step 509: This step is like Step 514. It is a similar self-timed Vinh precharge operation on the 4 KB Even CACHEmsb with the following bias conditions:

-   a) PREe4=H1 and LBLps4=Vinh
-   b) PREe1=PREe2=PREe3=CSL=TIE1˜TIEL/2=0V (L=4),
-   c) DIVen=SEGo=SEGe=0V.

Step 510: This is a step to load and latch the 4 KB P/RB's LSB B-state (MSB data) PGM-VFY page data to the 4 KB Even CACHEmsb with the following bias conditions:

-   a) SEGe1=SEGe2=SEGe3=0V (for CACHEcel, CACHEint, and CACHE1sb)
-   b) TIE1˜TIEL/2=SEGo=0V (L=4),
-   c) SEGe4=Vdd=1,
-   d) DIVen=BIAS=WRT=One-shot of H1.

Step 511: This step is to sense and amplify the updated (new) iterative 4 KB A-state Even analog page data, obtained under the Vtamin condition in Even CACHEcel, to the 4 KB SA via the 4 KB metal2 GBLs as explained before with the following conditions:

-   a) SEGe2=SEGe3=SEGe4=0V (in CACHEint, CACHE1sb, and CACHEmsb),
-   b) CSL=0V,
-   c) TIE1˜TIEL/2=SEGo=0V (L=4),
-   d) SEGo1=0V (for CACHEcel),
-   e) DIVen=SEGe1=H1 (CACHEcel).

Step 512: This step is to do A-state bit flipping for the 4 KB Even LSB iterative A-state PGM-VFY operation in accordance with the 4 KB last (old) Even A-state page data in the 4 KB P/RB, the 4 KB Even MSB page data temporarily stored in the 4 KB CAP2, and the updated (new) 4 KB Even A-state page data stored in the 4 KB SAs. As a result, each P/RB Di=0 will be flipped to "1" when SA Qi=1 and CAP2=1=MLSB as set in Step 508. The preferred bias conditions are listed below:

-   a) INV=IDB=IDC=0V,
-   b) ENMLSB=PGM=WBK=0V,
-   c) IDAB=One-shot Vdd,
-   d) T5=1.
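The A-state bit-flipping rule of Step 512 can be expressed as a per-bit logic operation. The Python sketch below models each 4 KB page as a list of 0/1 values; the function name and this list representation are assumptions for illustration and do not describe the DB circuit itself.

```python
def flip_a_state(prb, sa, cap2):
    """Apply the Step 512 rule bit by bit.

    A P/RB bit Di=0 is flipped to 1 only when the SA bit Qi=1 and the
    CAP2 bit MLSB=1; every other bit is left unchanged.
    """
    if not (len(prb) == len(sa) == len(cap2)):
        raise ValueError("all pages must have the same number of bits")
    return [1 if (d == 0 and q == 1 and m == 1) else d
            for d, q, m in zip(prb, sa, cap2)]

old_page = [0, 0, 1, 0]
new_verify = [1, 1, 0, 0]
msb = [1, 0, 1, 1]
print(flip_a_state(old_page, new_verify, msb))
```

Only the first bit in this example satisfies all three conditions, so it alone is flipped.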

After this step, the 4 KB Even LSB page A-state iterative PGM-VFY operation is completed and the flow moves to Step 530.

Step 520 to Step 527: Another sub-flow, for the 4 KB Odd LSB Page A-state PGM-VFY operation. Basically, these steps are similar to Steps 505-512 of the 4 KB Even LSB A-state iterative PGM-VFY flow. The details are therefore omitted here for simplicity of description, as they will be apparent to those skilled in the art.

FIG. 7J is a flow chart showing a method for performing m-page MLC (LSB page) B-state Program-Verify operation according to an embodiment of the present invention. The method includes several steps from Step 530 to Step 542.

Step 530: Concurrent 8 KB LBL Vinh precharging. Like Step 496, this is a self-timed step for performing a 1-cycle 8 KB concurrent LBL Vinh precharge on both the 4 KB Even and 4 KB Odd C_(LBL) capacitors in one selected CACHEcel, with the identical bias conditions repeated below:

-   a) PREe1=PREo1=H1 and LBLps1=Vinh for precharging both the 4 KB Even and the 4 KB Odd C_(LBL) capacitors of CACHEcel in 1-cycle,
-   b) TIE1=0V to prevent leakage from CACHEcel to the other CACHEint,
-   c) DIVen=SEGo1=SEGe1=0V to prevent leakage of the 4 KB C_(LBL)1o and 4 KB C_(LBL)1e capacitors in CACHEcel to the common 4 KB GBLs,
-   d) TIEL/2=CSL=0V (L=4) because they are not used.

Step 531: XT bus checking. Like Step 497, this decision step checks whether the common bus lines of 64XTs+1SSLp+1GSLp are free, i.e., not occupied by other existing concurrent operations, after the Vinh-precharging step. If No, the flow loops and waits until the XT bus lines are free and available. If Yes, the flow moves to Step 532 to charge the m selected sets of 64WLs+1SSL+1GSL poly lines.

Step 532: Self-timed Charging & Latching of 64WLs+1SSL+1GSL with selected WL=Vtbmin condition. Like Step 498, this is a self-timed WL Vread charging step and is controlled by the same Vread-DA used for Step 498. The only difference between Step 532 and Step 498 is that the selected WL voltage is changed from Vtamin to Vtbmin. The voltages of the other poly lines are kept unchanged, as shown below in Table 5.

TABLE 5: one set of 64WLs + 1SSL + 1GSL voltages for m-page concurrent MLC (LSB) B-state PGM-VFY setting (HXD = Vread + Vt) and latching (HXD = 0 V)

1 WL (sel) = Vtbmin            1 XT (sel) = Vtbmin
63 WLs (un-sel) = Vread        63 XTs (un-sel) = Vread
1 SSL = Vdd                    1 SSLp = Vdd
1 GSL = Vread                  1 GSLp = Vread
HXD with matched Pi, Qj, Sk    Vread + Vt when setting, to enable local pump and latch, but Vss when latching

Step 533: Self-timed C_(LBL) Vinh discharging or retaining step. Like Step 499, this is a self-timed operation that performs C_(LBL) Vinh discharging or retaining by setting Vtbmin on each selected WL, in accordance with the stored states of the m 8 KB MLC cells in the m selected 8 KB CACHEcel. Again, this is like an ABL (All-BL) Program-Verify scheme for both the 4 KB Even and 4 KB Odd LSB pages per one 8 KB physical MLC-WL. The final value of each LSB C_(LBL) bit voltage under Vtbmin on the selected WL is determined by the state of each MLC cell in the corresponding CACHEcel. If an MLC cell is in the E-state or A-state, the corresponding LSB C_(LBL)'s Vinh voltage is discharged to Vss; otherwise the C_(LBL) voltage retains its initial Vinh value, under the following conditions: 1) E-cell=A-cell=0V in CACHEcel, because Vtbmin>Vtamax>Vtemax≧−0.5V and TIE1=0V; 2) B-cell=C-cell=Vinh in CACHEcel, because Vtemax<Vtamax<Vtbmin<Vtcmin. The preferred bias conditions are identical to those of Step 499 and are repeated below.

-   a) CSL=PREe1=PREo1=LBLps1=0V
    -   This means all 4 KB PREe1=4 KB PREe2=4 KB PREo1=4 KB PREo2=Vss. This is to prevent C_(LBL) leakage of both the 4 KB Even and 4 KB Odd CACHEcel to one common LBLps1 line.
-   b) SEGo1=SEGe1=0V
    -   This is to prevent metal1 C_(LBL) leakage of both CACHEcel to the common 4 KB metal2 GBLs.
-   c) DIVen=CSL=0V,
-   d) TIE1=0V
    -   Only the E-cells and A-cells in CACHEcel discharge Vinh to below 1.0V or to Vss; the other cells retain Vinh. Note that the C_(LBL) discharge is automatically detected by an LBLps-DA with its Vref=1.0V.

Step 534: Like Step 500A, the voltages of each set of 64WLs+1SSL+1GSL are preferably latched on the respective WL parasitic poly line capacitors at the same time as the corresponding 8 KB C_(LBL) capacitors perform the discharging and retaining step of Step 533.

As in Step 500A, the bias conditions of WL latching are summarized below in accordance with the Segment circuit shown in FIG. 2A.

-   a) CLWL=CLA=0,
-   b) CLR=ENS=0,
-   c) ENB=One-shot of Vss.

Step 535: Is XSo reset? This decision step checks if the selected Odd Segment's flag latch was reset in the last iterative PGM-VFY step. If Yes, the flow moves to Step 560 in FIG. 7K below to perform the 4 KB Odd Page PGM-VFY operation. If No, the flow moves to Step 536 to check if the Even Segment's flag latch was reset in the last iterative PGM-VFY step. If Yes, the flow moves to Step 543 to do the Even-page PGM-VFY operation in FIG. 7K. If No, the flow moves to Step 573.

The reason Step 560 performs the Odd, rather than the Even, page PGM-VFY step is to save steps. For example, if the last iterative PGM-VFY step was performed on the Even page first and the Odd page later, then the next iterative PGM-VFY step preferably starts with the Odd page and then proceeds to the Even page. In this manner, the number of page data transfers between the pseudo CACHE, SA and P/RB can be reduced.
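The alternating page order can be sketched in a few lines of Python. The function name and the tuple representation are hypothetical; the sketch only shows the ordering rule, namely that each verify state starts with the page on which the previous state ended, and it does not count the saved transfers.

```python
def verify_schedule(states=("A", "B", "C"), first=("Even", "Odd")):
    """Return the page order used for each verify state.

    Each state reverses the order of the previous one, so the page
    verified last is also the page verified first in the next state.
    """
    order = list(first)
    schedule = []
    for state in states:
        schedule.append((state, tuple(order)))
        order.reverse()
    return schedule

for state, pages in verify_schedule():
    print(state, pages)
```

With the A-state flow running Even then Odd, this yields Odd then Even for the B-state and Even then Odd again for the C-state.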

Step 537: This step includes 6 sub-steps (560-565) of the LSB Odd page B-state PGM-VFY operation as shown in FIG. 7K below. After this step, the following data are stored in the respective circuits:

-   a) CAP2: To store the 4 KB LSB raw page data (step 561)
    -   The 4 KB Odd LSB raw page data are stored in the 4 KB CAP2 from the 4 KB Odd CACHE1sb. After CAP3 latching, the same 4 KB Odd LSB raw page data are loaded and latched back to the original 4 KB Odd CACHE1sb (step 563).
-   b) SA: To store the 4 KB updated (new) B-state Odd PGM-VFY page data (step 564)
    -   This step is to read back the 4 KB B-state Odd PGM-VFY page data from the 4 KB Odd CACHEcel under Selected WL=Vtbmin.
-   c) P/RB: To store the last (old) page data and the updated (new) 4 KB Odd LSB PGM-VFY page data.

Step 538: This is a self-timed Vinh precharge operation on the 4 KB Odd CACHEint with the following bias conditions:

-   a) PREe2=PREo2=H1 and LBLps1=Vinh,
-   b) CSL=TIE1˜TIEL/2=0V (L=4),
-   c) DIVen=SEGo2=SEGe2=0V,
-   d) Others=0V.

Step 539: This is a step to load and latch the 4 KB P/RB's updated LSB B-state PGM-VFY page data to the 4 KB Odd CACHEint with the following bias conditions:

-   a) SEGe3=SEGe4=0V (for CACHE1sb and CACHEmsb),
-   b) TIE1˜TIEL/2=SEGo=0V (L=4),
-   c) SEGo2=Vdd=1,
-   d) SEGo1=0V (for CACHEcel),
-   e) DIVen=BIAS=PGM=H1 one-shot pulse.

Step 540: This step switches from the just-completed Odd LSB page to start the 4 KB Even LSB B-state PGM-VFY operation. Like Step 505 for reading the Even LSB page, this step reads out the 4 KB Even LSB B-state results from the 4 KB Even CACHEint to the 4 KB SAs with the same polarity but under Vtbmin on each selected WL: a) SA Qi=0 if CACHEint=Vss for both E-cell and A-cell, b) SA Qi=1 if CACHEint=Vinh for both B-cell and C-cell.

Step 541: This step is like Step 506 but transfers the 4 KB SA Even LSB B-state PGM-VFY data to the 4 KB P/RB with the same polarity with the following conditions:

-   a) INV=IDB=IDAB=0V,
-   b) ENMLSB=PGM=0V,
-   c) IDC=WBK=One-shot pulse of Vdd and T5=1.

Step 542: This step performs the duplicated 6 sub-steps 545-550, which jointly perform the 4 KB Even LSB B-state PGM-VFY steps. The details are explained below with FIG. 7K.

FIG. 7K is a flow chart showing a method for performing m-page MLC (LSB page) B-state Program-Verify operation according to an embodiment of the present invention. Since this is a B-state PGM-VFY operation, the total number of steps for the Even and Odd pages is less than for the A-state PGM-VFY operation because there is no need to read the A-state again. In contrast to the A-state LSB PGM-VFY, this B-state PGM-VFY operation preferably starts from the 4 KB LSB Odd page because the last step of the A-state PGM-VFY operation ends with the LSB Odd page. A few steps can thus be saved if the B-state iterative PGM-VFY operation starts from the 4 KB LSB Odd page.

From Step 545 to Step 550, a 4 KB LSB Even Page B-state PGM-VFY operation is performed. Step 545: Like Step 505, this step is to sense and amplify the 4 KB raw LSB Even-page data stored in Even CACHE1sb to the 4 KB SA via the 4 KB metal2 GBLs as explained before with the following conditions:

-   a) DIVen=SEGe3=H1 (in CACHE1sb),
-   b) SEGe1=SEGe2=SEGe4=0V (in CACHEcel, CACHEint, and CACHEmsb),
-   c) CSL=SEGo3=0V,
-   d) TIE1˜TIEL/2=SEGo=0V (L=4),
-   e) Voutp=Vref+/−ΔV,
-   f) T5=One-shot pulse of Vss.

Step 546: This step is to transfer the 4 KB SA Even LSB raw data to the 4 KB CAP2 with the same polarity, such that MLSB=1/0 if Qi=1/0. The preferred bias conditions are summarized below:

-   a) INV=0V,
-   b) IDAB=IDB=IDC=WBK=0V,
-   c) ENSB2=PGM=0V,
-   d) ENSB1=One-shot pulse of Vdd,
-   e) T5=1.

Step 547: This is a similar self-timed Vinh precharge operation on the 4 KB Even CACHE1sb. The purpose of this step is to prepare for saving the 4 KB Even raw LSB page data back to the 4 KB Even CACHE1sb because, after Step 545, the stored 4 KB LSB raw data is corrupted by charge sharing between each LBL and each GBL. The preferred bias conditions are summarized below:

-   a) PREe3=H1 and LBLps3=Vinh,
-   b) PREe1=PREe2=PREe4=CSL=TIE1˜TIEL/2=0V (L=4),
-   c) DIVen=SEGo=SEGe=0V.

Step 548: This is a step to load and latch the 4 KB Even LSB raw page data to the 4 KB Even CACHE1sb with the following bias conditions:

-   a) SEGe1=SEGe2=SEGe4=0V (for CACHEcel, CACHEint, and CACHEmsb),
-   b) TIE1˜TIEL/2=SEGo=0V (L=4),
-   c) SEGe3=Vdd=1,
-   d) DIVen=BIAS=WRT=One-shot pulse of H1.

Step 549: This step is to sense and amplify the updated (new) iterative 4 KB B-state Even analog page data, obtained under the Vtbmin condition in Even CACHEcel, to the 4 KB SA via the 4 KB metal2 GBLs as explained before with the following conditions:

-   a) SEGe2=SEGe3=SEGe4=0V (in CACHEint, CACHE1sb and CACHEmsb),
-   b) CSL=0V,
-   c) TIE1˜TIEL/2=SEGo1=0V (for CACHEcel) (L=4),
-   d) DIVen=SEGe1=H1 (CACHEcel).

Step 550: This step is to do B-state bit flipping for the 4 KB Even LSB page of the iterative B-state PGM-VFY operation in accordance with the 4 KB last (old) Even B-state page data in the 4 KB P/RB, the 4 KB Even LSB page data temporarily stored in the 4 KB CAP2, and the updated (new) 4 KB Even B-state page data stored in the 4 KB SAs. As a result, each P/RB Di=0 will be flipped to "1" when SA Qi=1 and CAP2=1=MLSB. The preferred bias conditions are listed below:

-   a) INV=IDB=IDC=0V,
-   b) ENMLSB=PGM=WBK=0V,
-   c) IDAB=One-shot Vdd and T5=1.

After this step, the update of the 4 KB Even LSB page data of the B-state iterative PGM-VFY operation is completed and the flow moves to Step 580.

From Step 560 to Step 565, a 4 KB Odd LSB Page B-state PGM-VFY operation is performed. Basically, these steps repeat, for the 4 KB Odd LSB page, the B-state iterative PGM-VFY flow of Step 545 to Step 550 described for the 4 KB Even LSB page. The details are therefore omitted here for simplicity of description, as they will be apparent to those skilled in the art.

FIG. 7L is a flow chart of a method for performing m-page MLC (LSB page) C-state Program-Verify operation according to an embodiment of the present invention. Since this is a C-state PGM-VFY operation, the total number of steps for the Even and Odd pages is further reduced as compared to the A-state and B-state PGM-VFY operations because there is no need to read the A-state and B-state again. In contrast to the B-state LSB PGM-VFY operation, this C-state PGM-VFY operation preferably starts from the 4 KB Even LSB page because the last step of the B-state PGM-VFY ends with an Even LSB page. A few steps can thus be saved if the C-state iterative PGM-VFY operation continues from the B-state PGM-VFY flow.

Step 580: Concurrent 8 KB C_(LBL) Vinh precharging. Like Step 496, this is a self-timed step for performing a 1-cycle 8 KB concurrent C_(LBL) Vinh precharge on both the 4 KB Even and 4 KB Odd C_(LBL) capacitors in one selected CACHEcel, with the identical bias conditions repeated below:

-   a) PREe1=PREo1=H1 and LBLps1=Vinh for precharging both the 4 KB Even and the 4 KB Odd C_(LBL) capacitors of CACHEcel in 1-cycle,
-   b) TIE1=0V to prevent leakage from CACHEcel to the other CACHEint.
-   c) DIVen=SEGo1=SEGe1=0V to prevent leakage to the 4 KB GBLs.
-   d) TIEL/2=CSL=0V (L=4) because they are not used.

Step 581: XT bus checking. Like Step 531, this decision step checks whether the common bus lines of 64XTs+1SSLp+1GSLp are free, i.e., not occupied by other existing concurrent operations, after the Vinh-precharging step. If No, the flow loops and waits until the XT bus lines are free and available. If Yes, the flow moves to Step 582 to charge the m selected sets of 64WLs+1SSL+1GSL poly lines.

Step 582: Self-timed Charging & Latching of 64WLs+1SSL+1GSL with selected WL=Vtcmin condition. Like Step 532, this is a self-timed WL Vread charging step and is controlled by the same Vread-DA used for Step 532. The only difference between Step 582 and Step 532 is that the selected WL voltage is changed from Vtbmin to Vtcmin. The voltages of the other poly lines are kept unchanged, as shown below in Table 6.

TABLE 6: one set of 64WLs + 1SSL + 1GSL voltages for m-page concurrent MLC (LSB) C-state PGM-VFY setting (HXD = Vread + Vt) and latching (HXD = 0 V)

1 WL (sel) = Vtcmin            1 XT (sel) = Vtcmin
63 WLs (un-sel) = Vread        63 XTs (un-sel) = Vread
1 SSL = Vdd                    1 SSLp = Vdd
1 GSL = Vread                  1 GSLp = Vread
HXD with matched Pi, Qj, Sk    Vread + Vt when setting, to enable local pump and latch, but Vss when latching

Step 583: Self-timed C_(LBL) Vinh discharging or retaining step. Like Step 499, this is a self-timed operation that performs C_(LBL) Vinh discharging or retaining by setting Vtcmin on each selected WL, in accordance with the stored states of the m 8 KB MLC cells in the m selected 8 KB CACHEcel. Again, this is like an ABL (All-BL) Program-Verify scheme for both the 4 KB Even and 4 KB Odd LSB pages per one 8 KB physical MLC-WL. The final value of each LSB C_(LBL) bit voltage under Vtcmin on the selected WL is determined by the state of each MLC cell in the corresponding CACHEcel. If an MLC cell is in the E-state, A-state or B-state, the corresponding LSB C_(LBL)'s Vinh voltage is discharged to Vss; otherwise the C_(LBL) voltage retains its initial Vinh value, under the following conditions: 1) E-cell=A-cell=B-cell=0V in CACHEcel, because Vtcmin>Vtbmin>Vtamax>Vtemax≧−0.5V and TIE1=0V; 2) C-cell=Vinh in CACHEcel, because Vtemax<Vtamax<Vtbmin<Vtcmin. The preferred bias conditions are listed below:

-   a) CSL=PREe1=PREo1=LBLps1=0V
    -   This means all 4 KB PREe1=4 KB PREe2=4 KB PREo1=4 KB PREo2=Vss. This is to prevent C_(LBL) leakage of both the 4 KB Even and 4 KB Odd CACHEcel to one common LBLps1 line.
-   b) SEGo1=SEGe1=0V
    -   This is to prevent metal1 C_(LBL) leakage of both CACHEcel to the common metal2 4 KB GBLs.
-   c) DIVen=CSL=0V,
-   d) TIE1=0V.
    -   The E-cells, A-cells and B-cells in CACHEcel discharge Vinh to below 1.0V or to Vss; the C-cells retain Vinh. Note that the C_(LBL) discharge is automatically detected by an LBLps-DA with its Vref=1.0V.

Step 584: Unlike the previous A-state and B-state PGM-VFY steps, the voltages of each set of 64WLs+1SSL+1GSL are preferably not latched on the respective WL parasitic poly line capacitors. Since the C-state PGM-VFY operation is the last MLC PGM-VFY step, there is no need to save them. Therefore, after Step 583, Step 584 discharges the 64WLs+1SSL+1GSL to reduce Vread HV stress on the cells, for longevity of P/E cycles, in accordance with the following bias conditions.

-   a) LA=CLR=ENS=0V,
-   b) ENB=1=Vdd,
-   c) XT1-XT64=SSLp=GSLp=0V,
-   d) CLWL=One-shot pulse of Vdd.

Step 585: Is XSo reset? This decision step checks if the selected Odd Segment's flag latch was reset in the last iterative PGM-VFY step. If Yes, the flow moves to Step 605 in FIG. 7M below to perform the 4 KB Even Page PGM-VFY operation. If No, the flow moves to Step 586 to check if the Even Segment's flag latch was reset in the last iterative PGM-VFY step. If Yes, the flow moves to Step 620 to do the Odd-page PGM-VFY operation in FIG. 7M below. If No, the flow moves to the next Step 587.

Step 587: This step duplicates the sub-steps 605 to 614 to perform the 4 KB Even LSB page C-state PGM-VFY operation. The details are explained with FIG. 7M below.

Step 588: This step is to sense and amplify the 4 KB last C-state Even analog page data, obtained under the Vtcmin condition in Even CACHEint, to the 4 KB SA via the 4 KB metal2 GBLs as explained before with the following conditions:

-   a) SEGe1=SEGe3=SEGe4=0V (in CACHEcel, CACHE1sb and CACHEmsb),
-   b) CSL=0V,
-   c) TIE1˜TIEL/2=SEGo1=0V (for CACHEcel) (L=4),
-   d) DIVen=SEGe2=H1 (CACHEint).

Step 589: This step is to transfer the data of the 4 KB SAs to the 4 KB P/RB with the same polarity, such that each P/RB Di=1/0 in accordance with each SA Qi=1/0. The preferred bias conditions are listed below:

-   a) INV=IDB=IDAB=0V,
-   b) ENMLSB=PGM=0V,
-   c) IDC=WBK=One-shot Vdd and T5=1.

Step 590: In contrast to Step 587, this step duplicates the sub-steps 620 to 629 to perform the 4 KB Odd LSB page C-state PGM-VFY operation. The details are explained with FIG. 7M below.

Step 591: This is a decision step to check if the last page of the iterative MLC LSB C-state PGM-VFY operation is completed. If Yes, the iteration count N is checked against Nmax at Step 592; if Nmax has not been reached, N is incremented (N=N+1) at Step 593 and the flow moves to Step 480. If Nmax is reached, the flow moves to Step 594 to report a defective page and end the MLC Program and Program-Verify operation. If No at Step 591, some LSB pages have not finished programming yet, and the flow moves to Step 480 in FIG. 7F to continue the MLC LSB Program and Program-Verify operation.
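The bounded iteration controlled by Nmax can be illustrated with a simplified Python sketch. It collapses one full Program plus A-state, B-state and C-state verify pass into a single callback. The callback, the function name and the returned labels are hypothetical, and the sketch is a generic bounded retry loop rather than a literal transcription of Steps 591 to 594.

```python
def program_verify_loop(verify_pass, n_max):
    """Repeat program and verify until all pages pass or Nmax is reached.

    `verify_pass(n)` stands in for one Program plus PGM-VFY iteration
    and returns True when every selected page has passed.
    """
    for n in range(1, n_max + 1):
        if verify_pass(n):
            return ("done", n)        # all selected pages verified
    return ("defective", n_max)       # iteration budget exhausted

# Hypothetical page that passes on its third iteration.
print(program_verify_loop(lambda n: n >= 3, n_max=5))
```

Bounding the loop by Nmax guarantees termination and lets a page that never verifies be reported as defective.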

FIG. 7M is a flow chart showing a method for performing m-page MLC (LSB page) C-state Program-Verify operation according to an embodiment of the present invention. Since this is a C-state PGM-VFY operation, the total number of steps for the Even and Odd pages is less than for the B-state PGM-VFY because there is no need to read the A-state and B-state again. In contrast to the B-state LSB PGM-VFY, this C-state PGM-VFY operation preferably starts from the 4 KB Even LSB page because the last step of the B-state PGM-VFY operation ends with an Even LSB page. A few steps can thus be saved if the C-state iterative PGM-VFY operation starts from the 4 KB Even LSB page.

Steps 605-614 perform the 4 KB Even LSB Page C-state PGM-VFY operation. Step 605: Like Step 505, this step is to sense and amplify the 4 KB raw LSB Even-page data stored in Even CACHEcel to the 4 KB SA via the 4 KB metal2 GBLs as explained before with the following conditions:

-   a) DIVen=SEGe1=H1 (in CACHEcel),
-   b) SEGe2=SEGe3=SEGe4=0V (in CACHEint, CACHE1sb and CACHEmsb),
-   c) CSL=SEGo1=0V,
-   d) TIE1˜TIEL/2=0V (L=4),
-   e) Voutp=Vref+/−ΔV,
-   f) T5=One-shot pulse of Vss.

Step 606: This step is mainly to flip the 4 KB P/RB bit values according to the 4 KB SAs, which store the 4 KB updated (new) C-state PGM-VFY data, in accordance with the DB circuit shown in FIG. 3. The preferred bias conditions are summarized below:

-   a) INV=IDB=IDAB=0V,
-   b) ENMLSB=PGM=WBK=0V,
-   c) IDC=One-shot Vdd and T5=1,
    -   As a result, each P/RB bit of Di=0 will be flipped to "1" when the corresponding bit of SA Qi=1.

Thus, after this step, the bit flipping for the updated C-state PGM-VFY data is obtained.

Step 607: This is a decision step used to check if all m selected random Even LSB pages pass the MLC PGM-VFY of all of the A, B and C states. This can be done automatically by the PGM-VFY Check circuit 108 as shown in FIG. 3. As explained before, when the A-state, B-state and C-state PGM-VFY pass, all 4 KB P/RB DiB=0V under the PCHK=Vdd condition. The OR-line of PASS then becomes Vdd because ENB=0V, which increases the PASS counter by one. If Yes, the flow moves to Step 608 to reset the XSe (Even Segment Latch) flag latches.
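The page-level pass condition can be sketched in Python as follows. The sketch assumes that DiB is the logical complement of the P/RB bit Di, so that a page passes when every DiB is 0. That assumption, the function name, and the list representation of a page are illustrative and are not taken from the PGM-VFY Check circuit itself.

```python
def page_passes(prb_bits):
    """Return True when every complemented P/RB bit DiB is 0.

    Assumes DiB = NOT Di, so the page passes only if all Di are 1.
    A single failing bit keeps the wired-OR PASS line from asserting.
    """
    return all((1 - d) == 0 for d in prb_bits)

def count_passes(pages):
    """Count passing pages, analogous to incrementing a PASS counter."""
    return sum(1 for page in pages if page_passes(page))

print(count_passes([[1, 1, 1], [1, 0, 1], [1, 1, 1]]))
```

A single bit that has not yet verified is enough to keep a page from being counted as passed.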

Step 609: This decision step checks if the XSe flag latch has been reset due to the pass of the PGM-VFY operations of the selected pages. If Yes, the next step is to reset the XD flag latches.

Step 611: This decision step checks if all XD flag latches of the selected Segments have been reset due to the pass of all PGM-VFY operations of the selected pages. If Yes, the m-page MLC Program and PGM-VFY operations are finished successfully and end at Step 612. If No, the m-page MLC Program and PGM-VFY operations have to be continued for the unfinished pages, and the flow moves to Step 613.

Step 613: Concurrent 4 KB LBL Vinh precharging. Like Step 496, this is a self-timed step for performing a 1-cycle 4 KB concurrent LBL Vinh precharge only on both the selected 4 KB Even CACHEcel and the 4 KB Even CACHEint with the following bias conditions:

-   a) PREe1=H1 and LBLps1=Vinh but PREo1=Vss for precharging the 4 KB Even CACHEcel in 1-cycle,
-   b) PREe2=H1 and LBLps2=Vinh but PREo2=Vss for precharging the 4 KB Even CACHEint in 1-cycle,
-   c) TIE1˜TIEL/2=0V (L=4) to prevent leakage from CACHEcel to the other CACHEint.
-   d) DIVen=SEGo1=SEGe1=0V to prevent leakage to the 4 KB GBLs.
-   e) Others=0V because they are not used.

The reason for not precharging the 4 KB Odd CACHEcel is to keep the 4 KB Odd C_(LBL) capacitors unchanged and independent from the 4 KB Even C_(LBL) capacitors. In some cases, either the Even or the Odd LSB pages may not have all passed. Precharging only the Even C_(LBL) capacitors does not destroy the last stored Odd data, and vice versa.
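The independence of the Even and Odd C_(LBL) capacitors can be shown with a small Python model. It assumes the LBLs are held in a list in which even indices are Even LBLs and odd indices are Odd LBLs. This indexing, the function name, and the symbolic voltage values are assumptions for illustration.

```python
def precharge_even_only(lbl_levels, vinh="Vinh"):
    """Precharge only the Even LBLs and leave the Odd LBLs untouched.

    Models PREe=H1 with PREo=Vss: Even capacitors are driven to Vinh,
    while Odd capacitors keep whatever level they last stored.
    """
    return [vinh if i % 2 == 0 else level
            for i, level in enumerate(lbl_levels)]

before = ["Vss", "Vinh", "Vss", "Vss"]   # Odd entries hold stored data
print(precharge_even_only(before))
```

The Odd entries are identical before and after the call, which is the property the Even-only precharge relies on.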

Step 614: This is a step to load and latch the 4 KB Even C-state PGM-VFY page data from the 4 KB P/RB to the 4 KB Even CACHEcel and CACHEint simultaneously in 1-cycle in accordance with the following bias conditions:

-   a) SEGe1=Vdd (CACHEcel),
-   b) SEGe2=Vdd (CACHEint),
-   c) SEGe3=0V (CACHE1sb),
-   d) SEGe4=0V (CACHEmsb),
-   e) TIE1˜TIEL/2=0V (L=4),
-   f) DIVen=BIAS=PGM=One-shot H1.

The flow then moves to Step 591 to check if all 4 KB Even LSB pages of the C-state PGM-VFY operations are completed, as seen in FIG. 7L.

Steps 620-629 are for performing 4 KB Odd LSB Page C-state PGM-VFYoperation. This flow is substantially similar as the Even LSB pagePGM-VFY operation. Thus the details are omitted herein for descriptionsimplicity.

As with the Even LSB pages, the flow then moves to Step 591 to check whether all 4 KB Odd LSB pages of the C-state PGM-VFY operations are independently completed, as seen in FIG. 7L.

FIG. 8A is a diagram showing NAND circuit structures and a method for performing m-page SLC Read operation according to an embodiment of the present invention. Basically, when an MLC cell stores only 2 states (E-state and B′-state), or 2 Vts per one physical MLC cell, it is like an SLC cell, which also stores 2 states and 2 Vts such as E-state and B′-state. The plurality of MLC cells is stored in each MLC-WL, while the plurality of SLC cells is stored in each SLC-WL as defined in previous pages. In each MLC-WL, there is one extra NAND cell, called a Flag cell, that stores either “1”, the erased bit data, or “0”, the programmed bit data, in a physically separate BL. As defined earlier, Flag cell=1 means that only the 2 states of the MSB bit are stored in each MLC physical cell. Flag cell=0 means that the full 4 states of both the MSB and LSB bits are stored in each MLC physical cell.

Therefore, when Flag=1, both the SLC Read operation and the MLC's MSB Read operation deal with the same storage of 2 Vts or 2 states per one physical NAND cell Read. Thus the m-page Read methodologies of SLC and MLC (MSB-bit only) are treated as the same below. In an embodiment, this preferred methodology includes four consecutive steps denoted as 1, 2, 3e, and 3o in FIG. 8A and three basic operations as explained below.

1): Vinh precharge on 8 KB CACHEcel C_(LBL) capacitors. Since this is a simple 2-Vt Read operation, no bit-flipping logic operation is required, and thus no extra 8 KB CACHEs are needed for storing temporary page data. As a result, only one 8 KB pseudo CACHEcel is required.

2): Vinh discharging and retaining on 8 KB CACHEcel C_(LBL) capacitors. This step is the state development stage of each SLC cell on the m selected SLC-WLs during the m-page SLC Read operation, regardless of whether a single random page or m pages of a selected Block are read. In an SLC Read, the Vinh discharge in CACHEcel happens for those SLC cells storing E-state (Bit data=0) with Vtemax<Vtamin. Those cells storing B′-state (Bit-data=1) would retain the Vinh precharged voltage because Vtb′min>Vtamin on the m selected WLs.

3): Charge-sharing (CS). The sensing or readout of each SLC cell means reading each NAND cell's analog voltage data stored in each corresponding C_(LBL) capacitor, which stores Vinh for a B′-state cell or Vss for an E-state cell after Vinh precharging. When reading one cell in each CACHEcel's metal1 LBL to each SA through each metal2 GBL line, charge-sharing between each C_(LBL) capacitor and each C_(GBL) capacitor will happen. As a result, the readout cell voltage of each C_(LBL) signal will be diluted and thus reduced. That is why a Multiplier is incorporated in each GBL to perform a first analog cell signal amplification, as for a DRAM cell, followed by a second analog amplification at the SA. Now, the steps of the m-page SLC Read methodology will be explained below by referring to FIG. 8A.
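The dilution caused by charge-sharing can be estimated with an ideal two-capacitor model. The sketch below is illustrative only: it assumes charge conservation between C_(LBL) and C_(GBL), a GBL that starts at 0V, and no leakage. The capacitance ratio and the 7V Vinh value are hypothetical numbers chosen for the example, not values taken from this disclosure.

```python
def charge_share(v_lbl, c_lbl, c_gbl, v_gbl=0.0):
    """Common voltage after connecting C_LBL to C_GBL (ideal charge conservation)."""
    return (c_lbl * v_lbl + c_gbl * v_gbl) / (c_lbl + c_gbl)


# Hypothetical example: Vinh = 7 V on the LBL, C_GBL four times larger than C_LBL.
v_b_state = charge_share(v_lbl=7.0, c_lbl=1.0, c_gbl=4.0)  # B'-state cell (Vinh retained)
v_e_state = charge_share(v_lbl=0.0, c_lbl=1.0, c_gbl=4.0)  # E-state cell (discharged)
print(v_b_state, v_e_state)
```

Under these assumptions the retained level shrinks in proportion to C_(LBL)/(C_(LBL)+C_(GBL)), which illustrates why a first amplification stage on the GBL is useful before the SA.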

Arrow1: This refers to a first step for performing the m-page SLC Read operation. In the prior-art 1-level BL NAND array scheme, a 1.0V precharge voltage on all selected long metal1 GBLs from the DB is required before starting any one full-page or one partial-page SLC Read operation. Unlike the prior-art 1-level BL NAND array scheme, which uses the higher power-consuming GBL for precharging, the present 2-level BL-hierarchical HiNAND2 array uses short metal1 LBL lines with less power for precharging the Vinh voltage from the selected LBLps lines, with the preferred bias conditions set below with reference to FIG. 1A for one 8 KB CACHEcel only:

a) PREo1=PREe1=H1 along with LBLps1=V2=Vinh to precharge both 4 KB Even and Odd CACHEcel C_(LBL) capacitors,
b) TIE1=Vss: To shut off the MLBLb transistor.
   This condition is to disconnect CACHEcel from CACHEint to prevent Vinh leakage.
c) SEGo1=SEGe1=Vss:
   These conditions are to shut off both the 4 KB Even MLBLpe and 4 KB Odd MLBLpo GBL-divided transistors of the 8 KB CACHEcel registers so that no leakage would happen from the 8 KB CACHEcel to the common 4 KB metal2 GBLs.
d) Other SEGo2=SEGe2=SEGo3=SEGe3=SEGo4=SEGe4=0V and TIE2=0V, PREo2=PREe2=PREo3=PREe3=PREo4=PREe4=0V because CACHEint, CACHE1sb and CACHEmsb are not selected for this SLC Read operation.

Arrow2: This refers to a second step to read out m pages of 8 KB 2-Vt SLC data stored in 8 KB cells in m CACHEcels. Basically, an SLC Read is a VR1 discharging step, i.e., each selected WL in each CACHEcel Segment is coupled with VR1=Vtamin along with the remaining 63 WLs=Vread and 1SSL=Vdd and 1GSL=Vread.

When an SLC cell's Vt<Vtamin, C_(LBL) is discharged from Vinh to 0V. This is an E-state cell with Logic bit=1. When an SLC cell's Vt>Vtamin, C_(LBL)=Vinh, i.e., the Vinh voltage is retained. This is a B′-state cell with Logic bit=0.
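The discharge-or-retain rule above can be written as a small decision function. This is a minimal sketch, not the disclosed circuit behavior: the 7V Vinh value is an assumed point within the 7V to 10V range mentioned elsewhere in this description, and the function name is hypothetical.

```python
VINH = 7.0  # assumed precharge level in volts
VSS = 0.0


def slc_read(cell_vt, vr1):
    """Return (C_LBL voltage, logic bit) after the VR1 discharge step."""
    if cell_vt < vr1:
        # E-state: the selected cell conducts, so the precharged Vinh is discharged.
        return VSS, 1
    # B'-state: the selected cell stays off, so Vinh is retained on C_LBL.
    return VINH, 0


# Example with VR1 = Vtamin = 0 V and two hypothetical cell thresholds.
print(slc_read(cell_vt=-1.0, vr1=0.0))
print(slc_read(cell_vt=2.0, vr1=0.0))
```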

Note, this is a self-timed step using an LBLps Detector DA connected to one end of LBLps1 (not shown). In a real design, the LBL's Vinh Detector does not need to detect Vinh dropping to Vss. Instead, it is designed to detect LBLps=Vinh≈2.0V with a quick response so that the Vinh Detector can be disabled earlier to prevent leakage between the Vinh-retaining C_(LBL) capacitors and the discharging C_(LBL) capacitors by shutting off both MLBLso and MLBLse transistors before the C_(LBL) line voltage drops to Vss. In conclusion, after the step indicated by Arrow2, one of two analog voltages is temporarily stored in each C_(LBL) capacitor, namely CACHEcel=Vinh (B′-cells) or CACHEcel=Vss (E-cells). For m-page Block SLC concurrent Read, a total of m 8 KB CACHEcel C_(LBL) capacitors would store m 8 KB SLC Read page data with voltage patterns of Vinh or Vss in accordance with the stored states, B′-state or E-state, of the m pages of SLC cells.

Arrow3 (3e and 3o): The 3e Step is to perform 2 cycles of C_(LBL) analog voltage sensing and amplification from the 4 KB Even CACHEcel by one 4 KB Multiplier and one 4 KB SA. The first cycle is to sense, transfer, and amplify the 4 KB Even SLC analog data by the Multiplier (102 of FIG. 3), and the second cycle is to perform further amplification to full digital bit data by the SA (104 of FIG. 3). The sensing and transferring operation is performed between each metal1 LBL and each metal2 GBL. Therefore, the 4 KB divided GBL paired transistors, 4 KB MLBLpe and 4 KB MLBLPo, of the 8 KB selected CACHEcel and CACHEint have to be turned on one at a time in accordance with the location of the selected Group, due to the limitation of 4 KB GBLs for area saving.

Referring to FIG. 8A, the flow shows that the 4 KB Even C_(LBL) capacitors are connected to the 4 KB GBLs via the 4 KB transistors MLBLpe with the following preferred bias conditions:

a) 4 KB SEGe1=H1 but 4 KB SEGo1=0V,
b) 4 KB PREe1=4 KB PREo1=0V.

After this step, the 4 KB Even analog cell data are turned into 4 KB Even digital data with the same polarity, stored in the 4 KB SA's Qi and QiB nodes as seen in the SA circuit 104 in FIG. 3. In other words, Qi=Vdd when C_(LBL)=Vinh and Qi=Vss when C_(LBL)=Vss.

Arrow3o: This step is like the 3e Step but is to sense and read out the 4 KB Odd SLC page data stored in the 4 KB corresponding Odd CACHE C_(LBL) capacitors in accordance with the following bias conditions:

a) 4 KB SEGo1=H1 but 4 KB SEGe1=0V,
b) 4 KB PREo1=4 KB PREe1=0V.

After this step, the 4 KB Odd analog cell data are turned into 4 KB Odd digital data with the same polarity, stored in the 4 KB SA's Qi and QiB nodes as seen in the SA circuit 104 in FIG. 3. In other words, Qi=Vdd when C_(LBL)=Vinh and Qi=Vss when C_(LBL)=Vss.
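The Even-then-Odd sensing order of the 3e and 3o steps can be sketched as a two-phase loop over the shared GBLs. This is a simplified illustration under stated assumptions: the Multiplier and SA are idealized as a single threshold decision at Vinh/2, charge-sharing loss is ignored, and the Vinh and Vdd values and the function name are hypothetical.

```python
VINH, VDD, VSS = 7.0, 1.8, 0.0  # assumed voltage levels


def sense_cachecel(even_lbl, odd_lbl):
    """Return [(step, bias, Qi levels)] for the 3e cycle followed by the 3o cycle."""
    phases = (
        ("3e", {"SEGe1": "H1", "SEGo1": "0V", "PREe1": "0V", "PREo1": "0V"}, even_lbl),
        ("3o", {"SEGo1": "H1", "SEGe1": "0V", "PREo1": "0V", "PREe1": "0V"}, odd_lbl),
    )
    results = []
    for step, bias, lbl_voltages in phases:
        # Idealized sensing: Qi = Vdd when C_LBL holds Vinh, Qi = Vss when it holds Vss.
        qi = [VDD if v > VINH / 2 else VSS for v in lbl_voltages]
        results.append((step, bias, qi))
    return results


for step, bias, qi in sense_cachecel([VINH, VSS], [VSS, VINH]):
    print(step, bias, qi)
```

Only one of SEGe1 and SEGo1 is high in each phase, reflecting that the Even and Odd LBLs share the same 4 KB GBLs and must be sensed one after the other.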

FIG. 8B is a diagram showing NAND circuit structures and a method for performing m-page MLC (MSB-page) Read operation with a Flag cell=1 according to an embodiment of the present invention. Basically, the m-page SLC and MLC (MSB-bit only) Read operation is an m-page Read operation of a 2-Vt SLC cell, or the first MSB-bit Read operation of an MLC cell without an LSB-bit Read operation because no LSB bit data is stored in the MLC physical cell, in accordance with the control signals and circuits of the HiNAND2 array shown in FIG. 1A, the Data Buffer shown in FIG. 3, and the Block-decoder and Segment-decoder shown in FIG. 2A and FIG. 2B.

It is the simplest m-page random-WL Read operation of the present invention, distinguishing E-state from B′-state cells by applying VR1 to m selected WLs concurrently. It needs only one set of 8 KB pseudo CACHEcel C_(LBL) capacitors to temporarily store each 8 KB Read page data in one of the m Segments' CACHEcel capacitors in one or more Groups in a Plane of the HiNAND2 array.

In an embodiment, this method deals with a 4-Vt MLC cell that stores both MSB bit and LSB bits with a Flag cell=1 and under VR3 Read conditions. The VR1 and VR2 Reads have been done in previous operations shown in FIG. 8A. Therefore, when Flag cell=1, this MLC Read operation will need a 3-Read step with a preferred sequence from VR1, VR2, and VR3 as the voltage on one selected WL is gradually increased per page. For the m-page 4-stage MLC Read operation, it will take m 3-Read steps to complete the whole m 8 KB MSB page and m 8 KB LSB page Read operation.

Referring to FIG. 8B, the method includes five consecutive steps denoted by 1, 2, 3e, 3o, 4e, 4o, 5e, and 5o for performing the following three basic operations: 1) Vinh precharging, 2) LBL discharging, and 3) Charge-sharing. Note, in the m-page LSB Read operation, a total of four pseudo CACHE registers, namely 8 KB CACHEcel, 8 KB CACHEint, 8 KB CACHE1sb, and 8 KB CACHEmsb, are required to properly flip the bit data in accordance with the accessed four states of each MLC cell.

step1: This is to perform the first operation, i.e., Vinh precharge. Only the 8 KB C_(LBL) capacitors of the 8 KB CACHEcel registers would be selected for implementing the Vinh precharge step as arrow1 indicates. The other CACHEs or Segments are disabled with the following bias conditions:

I. 4 KB PREo1=H1 and 4 KB PREe1=H1: This is to precharge both the 4 KB Odd and 4 KB Even C_(LBL) capacitors within one 8 KB CACHEcel register at the same time in 1-cycle time.
II. LBLps1=Vinh,
III. 4 KB PREo2=Vss and 4 KB PREe2=Vss,
IV. 4 KB PREo3=Vss and 4 KB PREe3=Vss,
V. 4 KB PREo4=Vss and 4 KB PREe4=Vss,
VI. 4 KB SEGe1=SEGe2=SEGe3=SEGe4=Vss,
VII. 4 KB SEGo1=SEGo2=SEGo3=SEGo4=Vss,
VIII. TIE1˜TIEL/2=Vss (L=4).

step2: This is to perform a second operation of C_(LBL) discharging under VR3 Read in accordance with each MLC cell's storage state within the one selected 8 KB CACHEcel register and the following bias conditions:

a) 4 KB PREo1=PREo2=PREo3=PREo4=Vss,
b) 4 KB PREe1=PREe2=PREe3=PREe4=Vss,
c) 4 KB SEGe1=SEGe2=SEGe3=SEGe4=Vss,
d) 4 KB SEGo1=SEGo2=SEGo3=SEGo4=Vss,
e) TIE1˜TIEL/2=Vss (L=4),
f) LBLPs1=LBLPs2=LBLps3=LBLps4=Vss.

The reason for setting all the above control signals to Vss is that only one CACHEcel is selected for the VR3 discharge operation. As a result, E-cell=A-cell=B-cell=0V but C-cell=Vinh. The discharge operation happens only within the 8 KB local CACHE C_(LBL) capacitors, without any need for the GBLs or the LBLps Vinh power line.

step3e: This 3e Step is to perform C_(LBL) analog voltage sensing under VR1 from one 8 KB CACHEint register only, by one 4 KB Multiplier and one 4 KB SA, divided into 2 cycles for each 8 KB LSB. The first cycle is to sense, amplify, and transfer the 4 KB Even LSB data, and the second cycle is to repeat these steps on the 4 KB Odd LSB data. The sensing cycle is performed between LBLs and GBLs. Therefore, the 4 KB divided GBL paired transistors MLBLpe and MLBLPo of the 8 KB selected CACHEint have to be turned on one at a time due to the limitation of 4 KB GBLs for area saving. Referring to FIG. 8B, the flow starts from an Even LSB page first and then an Odd LSB page subsequently, with 4 KB SEGe2=H1=Vinh+Vtn but 4 KB SEGo2=Vss.

step3o: This step is to follow the above step3e on the remaining 4 KB Odd CACHEcel C_(LBL) capacitors with the opposite bias condition of 4 KB SEGe2=Vss but 4 KB SEGo2=H1=Vinh+Vtn. Basically, step2 and step3 are to perform LBL discharging due to E and B′ state evaluation under VR1 on m selected WLs.

step4e: This step is to sense and amplify the VR2 operation on the 4 KB Even CACHEmsb C_(LBL) capacitors with the following bias condition: 4 KB SEGe4=H1 but 4 KB SEGo4=Vss.

step4o: This step is to repeat the above step4e for sensing and amplifying the remaining 4 KB Odd CACHEcel C_(LBL) capacitors with the opposite bias condition: 4 KB SEGe4=Vss but 4 KB SEGo4=H1. Basically, step4 is to perform charge-sharing (CS) between each shorter metal1 C_(LBL) line and each longer metal2 C_(GBL) line.

step5e: This step is to perform the Even cell's C_(LBL) analog voltage sensing at the 4 KB CACHEcel registers with the following bias condition: 4 KB SEGe1=H1 but 4 KB SEGo1=Vss.

step5o: This step is to repeat the above step5e for sensing the remaining 4 KB Odd CACHEcel C_(LBL) capacitors with the opposite bias condition: 4 KB SEGe1=Vss but 4 KB SEGo1=H1. Note, all SEGe2=SEGo2=SEGe3=SEGo3=SEGe4=SEGo4=Vss because the CACHEint, CACHE1sb, and CACHEmsb registers are not selected in this step. Step5 performs a Sample and Hold (S/H) operation, which means the local C_(LBL) data are sensed and retained for each final 4 KB Even and 4 KB Odd page of SLC or MLC's MSB page data.

FIG. 8C is a diagram of NAND circuit structures and a method for performing m-page MLC (MSB/LSB page) Read operation with Flag cell=0 according to an embodiment of the present invention. Basically, the m-page MLC Read operation is to read each 4-Vt MLC cell stored with both MSB-bit and LSB-bit in accordance with the control signals and circuits of the HiNAND2 array shown in FIG. 1A, the Data Buffer shown in FIG. 3, and the Decoders shown in FIG. 2A and FIG. 2B.

The MLC LSB Read is the most complicated m-page random-WL MLC Read operation, as it has to distinguish the four stored states E, A, B, and C of each MLC cell to determine the LSB bit data. A total of three 8 KB pseudo CACHEs, namely CACHEcel, CACHEint, and CACHE1sb, are involved in several logic operations before an accurate reading of each logic LSB page data is obtained from each 4-Vt MLC-WL. In an embodiment, this method deals with a 4-Vt MLC cell that stores both MSB bit and LSB bit with a Flag cell=0 and under VR3 Read conditions. The VR1 and VR2 Reads are done in the previous methodology shown in FIG. 8A.

Therefore, when Flag cell=0, this MLC Read operation needs 3 Read steps with a preferred sequence from VR1, VR2, and VR3 as the voltage of one selected WL is gradually increased per page. For the m-page 4-stage MLC Read operation, it will take m×3 Read steps to complete the whole m 8 KB MSB page Read operation and m 8 KB LSB page Read operation.

Referring to FIG. 8C, this preferred method includes 10 consecutive steps denoted as 1, 2, 3e, 3o, 4e, 4o, 5e, 5o, 6, 7, 8e, 8o, 9e, 9o, 10e, and 10o while involving several basic operations such as Vinh precharging, LBL discharging and retaining, WL charging and discharging, LBL/GBL CS (Charge-sharing), and C_(LBL) LD/LT (C_(LBL) Loading/Latching). Note, in this m-page MLC MSB/LSB Read operation, a total of three 8 KB pseudo CACHEs, namely 8 KB CACHEcel, 8 KB CACHEint, and 8 KB CACHEmsb, are required to properly flip bit data in accordance with the accessed 4-Vt MLC cell.

step1: This is a first step for implementing Vinh precharge, in which the three sets of 8 KB C_(LBL) capacitors of the 8 KB CACHEcel, CACHEint, and CACHEmsb are selected and the other CACHEs or Segments are disabled because they are not used, with the preferred bias conditions:

I. 4 KB PREo1=H1 and 4 KB PREe1=H1 and LBLps1=V2=Vinh,
   This is to precharge both the 4 KB Odd and 4 KB Even CACHEcel C_(LBL) capacitors with Vinh concurrently in 1-cycle time.
II. 4 KB SEGo1=4 KB SEGe1=Vss,
III. TIE1=0V,
   To isolate CACHEcel from CACHEint to allow independent Vinh precharging.
IV. 4 KB PREo2=H1 and 4 KB PREe2=H1 and LBLps2=V2=Vinh,
V. 4 KB SEGo2=4 KB SEGe2=Vss,
   This is to precharge both the 4 KB Odd and 4 KB Even CACHEint C_(LBL) capacitors with Vinh concurrently in 1-cycle time.
VI. 4 KB PREo4=H1 and 4 KB PREe4=H1 and LBLps2=V2=Vinh,
VII. 4 KB SEGo4=4 KB SEGe4=Vss,
   This is to precharge both the 4 KB Odd and 4 KB Even CACHEmsb C_(LBL) capacitors with Vinh concurrently in 1-cycle time.

Again, the step is a self-timed one controlled by a VLBL Detector connected to one end of the LBLps1 power line. Note, the Vinh precharging on the three CACHEcel, CACHEint, and CACHEmsb registers can be done in 1-cycle or in three consecutive cycles to reduce the peak current if necessary.

step2: This is a second step to perform the LBL's Vinh discharging or retaining under VR1 Read in accordance with each MLC cell's storage state within the three CACHEs mentioned above and the following bias conditions:

I. 4 KB PREo1=PREo2=PREo3=PREo4=Vss,
II. 4 KB PREe1=PREe2=PREe3=PREe4=Vss,
III. 4 KB SEGe1=SEGe2=SEGe3=SEGe4=Vss,
IV. 4 KB SEGo1=SEGo2=SEGo3=SEGo4=Vss,
V. TIE1=H1 but TIE2=Vss,
VI. LBLPs1=LBLPs2=LBLps3=LBLps4=Vss.

The reason for setting all the above control signals to Vss is to ensure that all C_(LBL) discharges are confined within CACHEcel, CACHEint, and CACHEmsb. After VR1 Read, E-cell=0, while A-cell=B-cell=C-cell=Vinh. The Vinh discharge operation happens on the C_(LBL) capacitors of all three CACHEs. The 4 KB GBL bus lines are not involved and thus become available for any new operations during this time interval. Note, since TIE1=H1, the final C_(LBL) voltages will be reflected in both CACHEcel and CACHEint and are identical. In other words, 8 KB CACHEcel=8 KB CACHEint=VR1 page data.

step3: This step is to perform the VR2 Read with TIE1=Vss, as well as all other control signals at Vss, to isolate the 8 KB CACHEcel from the 8 KB CACHEint. As a result, CACHEcel exclusively stores the VR2 Read MLC data while CACHEint still holds the VR1 Read MLC data due to the isolation under TIE1=Vss.

step4e: This step is to perform a first 4 KB Even cell's C_(LBL) analog voltage sensing from each of the 4 KB C_(LBL) capacitors by the Multiplier (102 of FIG. 3) and SA (104 of FIG. 3) via LBL/GBL charge-sharing with the following bias conditions:

I. 4 KB SEGe1=H1 but 4 KB SEGo1=Vss
   To connect the 4 KB Even CACHEcel to the 4 KB GBLs only for sensing. This step has to be performed on a 4 KB by 4 KB basis.
II. All SEGe2=SEGo2=SEGe3=SEGo3=SEGe4=SEGo4=Vss
   Because the CACHEint, CACHE1sb and CACHEmsb registers are not selected.
III. TIE1˜TIEL/2=Vss (L=4).

After this step, the 4 KB Even MLC cell analog data under VR2 is fully amplified to full digital bit data and stored at the 4 KB SAs. Note, the VR2 page data is the MSB page data under VR2=Selected WL, because VR2>Vtamax.

step4o: This step is to repeat the above step4e but on the 4 KB Odd CACHEcel. In an embodiment, this step waits for the completion of the 4 KB Even MLC data being written back to the 4 KB Even CACHEcel so that both the SA and P/RB are not occupied by the 4 KB Even MLC bit data. The details are omitted here for simplicity of description.

step5e: This step is to load and latch the prior 4 KB Even MLC MSB page digital data in the 4 KB SAs back to the 4 KB CACHEmsb C_(LBL) capacitors in accordance with the following bias conditions:

I. SEGe4=Vdd but SEGo4=0V and PREe4=Vss and TIE2=0V
   With SEGe4=Vdd, if MSB=Vdd in the SA, the precharged Vinh inside each corresponding C_(LBL) capacitor will be retained. For example, during write-back, if each MSB in SA=1, then GBL=Vdd. Thus each drain node of each MLBLpe=Vdd with gate=SEGe4=Vdd, and the source node of CLBL1e=Vinh will not leak to the GBL because the MLBLpe transistor is biased into a back-diode condition.
   Conversely, if each MSB in SA=0, then GBL=Vss. Thus each drain node of each MLBLpe=Vss with gate=SEGe4=Vdd, and the source node of CLBL1e=Vinh will be discharged to Vss through the GBL because the MLBLpe transistor is biased into a conduction condition. This is referred to as BL voltage conversion from each Vdd/Vss GBL to each Vinh/Vss LBL.
II. All SEGe2=SEGo2=SEGe3=SEGo3=SEGe4=SEGe4=Vss
   Because CACHEint and CACHE1sb are excluded when the CACHEmsb registers are selected by the SA, to avoid GBL bus contention.
   After this step, the 4 KB Even MLC cell analog data is fully amplified to full digital bit data and stored at the 4 KB SAs.
III. After LD/LT, the gate signal SEGe4 is set to Vss to latch the new 4 KB Even MSB page data into the 4 KB Even CACHEmsb.
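The Vdd/Vss to Vinh/Vss conversion described in item I can be sketched with an idealized NMOS pass-transistor model. This is a minimal illustration under assumptions: the transistor conducts only when its gate exceeds the lower-voltage terminal by more than Vtn, a conducting device pulls the LBL fully to the GBL level, and the Vinh, Vdd, and Vtn values and the function name are hypothetical.

```python
VINH, VDD, VSS = 7.0, 1.8, 0.0  # assumed voltage levels
VTN = 0.7                       # assumed NMOS threshold voltage


def write_back(sa_bit, seg_gate=VDD, lbl_precharged=VINH):
    """Return the C_LBL voltage after loading one SA bit through MLBLpe."""
    gbl = VDD if sa_bit == 1 else VSS
    # For an NMOS device the lower-voltage terminal acts as the source.
    source = min(gbl, lbl_precharged)
    conducts = (seg_gate - source) > VTN
    # Off: precharged Vinh is retained. On: the LBL is discharged to the GBL level.
    return gbl if conducts else lbl_precharged


print(write_back(1))  # GBL=Vdd, gate=Vdd: transistor off, Vinh retained
print(write_back(0))  # GBL=Vss, gate=Vdd: transistor on, LBL discharged
```

With the gate held at Vdd, a GBL at Vdd leaves no gate overdrive and the LBL keeps Vinh, while a GBL at Vss turns the device on and discharges the LBL, which is the two-case behavior the text describes.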

step5o: This step is to repeat the above step5e but to load and latch the 4 KB Odd MSB page data into the 4 KB Odd CACHEmsb with the opposite control logic, such as SEGo4=Vdd but SEGe4=0V and PREo4=Vss and TIE2=0V. The details are omitted here for simplicity of description. Basically, steps 3, 4, and 5 are to perform charge-sharing (CS) between each shorter metal1 LBL and each longer metal2 GBL.

step6: Only one 8 KB CACHEcel is selected for implementing the 2nd Vinh precharge under VR3 Read, as indicated by Arrow1 for the VR1 and VR2 Reads in prior steps. Again, this Arrow6 step is also a self-timed one controlled by each VLBL Detector connected to one end of the LBLps1 line, with the same bias conditions as in step1. Thus the details are omitted here.

step7: This is a step to perform m pages of 8 KB LBL's Vinh discharging or retaining under VR3 Read in accordance with each MLC cell's storage state within the m 8 KB CACHEcel C_(LBL) capacitors. The bias conditions are kept identical to those for the Arrow1 step. Thus the details are omitted here for simplicity of description. Note, after the Arrow6 step above, the MLC page data bit pattern of each 8 KB CACHEcel capacitor is referred to as C-state. This means that E=A=B=0V but C=1=Vinh is stored in the m 8 KB CACHE C_(LBL) capacitors. After step7, the three CACHEs store three different MLC page data as summarized below.

I. 8 KB CACHEcel=C-state under VR3 Read,
II. 8 KB CACHEint=A-state under VR1 Read,
III. 8 KB CACHEmsb=B-state under VR2 Read.
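The contents of the three CACHEs after step7 can be tabulated per stored state with a simple threshold model. The sketch below is illustrative only: the cell Vt values and the VR1, VR2, and VR3 read levels are hypothetical numbers chosen so that E < VR1 < A < VR2 < B < VR3 < C, and a cell is assumed to discharge its C_(LBL) fully whenever its Vt is below the applied read voltage.

```python
VINH, VSS = 7.0, 0.0  # assumed voltage levels

# Hypothetical representative Vt per state and hypothetical read levels.
STATE_VT = {"E": -1.0, "A": 1.0, "B": 2.5, "C": 4.0}
READ_LEVEL = {"VR1": 0.0, "VR2": 1.8, "VR3": 3.2}


def lbl_after_read(state, read_voltage):
    """C_LBL voltage after a read: discharged if the cell conducts, else Vinh."""
    return VSS if STATE_VT[state] < read_voltage else VINH


def cache_contents():
    """C_LBL pattern held by each pseudo CACHE after step7, keyed by cell state."""
    mapping = {"CACHEint": "VR1", "CACHEmsb": "VR2", "CACHEcel": "VR3"}
    return {
        cache: {s: lbl_after_read(s, READ_LEVEL[vr]) for s in STATE_VT}
        for cache, vr in mapping.items()
    }


for cache, pattern in cache_contents().items():
    print(cache, pattern)
```

In this model CACHEint separates E from the other states, CACHEmsb separates E and A from B and C, and CACHEcel isolates C, matching the E=A=B=0V, C=Vinh pattern stated for the VR3 Read.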

The next steps, referred to as Arrows 8e, 8o, 9e, 9o, 10e, and 10o in FIG. 8C, are to sense and amplify all the above stored analog MLC page data from the three respective CACHEcel, CACHEint, and CACHEmsb to the SA, P/RB, and CAP1 and CAP2 in the DB for MLC MSB and LSB bit-flipping logic operations. These 3 Read operations are performed on a 4 KB by 4 KB basis, the same as the previously described step4e and step4o. More details will be described in subsequent flows shown in FIGS. 8D through 8G below.

The subsequent m-page SLC and MLC Read flows are basically designed in accordance with the above m-page MLC Read Methodologies. Since the HiNAND2 array structure and the above Methodologies are highly flexible, the following flows are merely examples for illustrative purposes without limiting the scope of the claims.

FIG. 8D is a flow chart of a method for performing m-page SLC Read operation according to an embodiment of the present invention. The flow starts from receiving an SLC m-page command and m selected page Addresses and ends with outputting m 4 KB SLC Even and m 4 KB SLC Odd page data sequentially to 8 I/Os on a 1-byte by 1-byte basis. Each 8 KB SLC page in each selected physical 8 KB SLC-WL stores 8 KB 2-Vt cells storing either E-state=1 or A-state=0 when a preferred VR1 is coupled to the m selected SLC WLs during the concurrent m-page SLC Read operation of the present invention. As shown, the m-page 2-Vt SLC Read flow is divided into two sub-flows including a first m-page Even 2-Vt SLC Read and I/O output flow and a second m-page Odd 2-Vt SLC Read and I/O output flow.

Step 800: This step is to sequentially receive, load and decode the m-page SLC Read Command and its associated m pages of Read Addresses in units of bytes via the NAND's 8 I/Os from the off-chip Flash Controller into the NAND's designated Command and Address Buffers (not shown). In addition, m latches of m selected Even Segments and m selected Odd Segments and m Block-decoders are also set according to the m-page Addresses stored in the m Address Buffers for the concurrent m-page SLC Read operation.

For example, the new SLC Command is loaded into the designated Command register so that it can be decoded and the associated SLC Read flows can be initiated immediately. Similarly, the m-page Addresses are loaded into m designated on-chip Address Buffers in conjunction with other control circuits (not shown) to set the corresponding m Even and Odd Segments' latches as shown in FIG. 2B and m Block latches as shown in FIG. 2A of the preferred HiNAND2 array (FIG. 1A).

Moreover, the m addressed 8 KB SLC NAND page data are divided into m 4 KB Even-page SLC data and m 4 KB Odd-page SLC data. These m pages of SLC data are selected concurrently by m Segment latches with m Block latches. Besides the newly proposed SLC Command, an m-page Address arrangement is also introduced in the present invention. In the prior art, usually only one page address in one selected NAND plane is allowed, and thus only one SLC page is specified in the common NAND Read command. In this HiNAND2 array, however, m page Addresses in every selected NAND plane are allowed, providing the flexibility of specifying up to m page Addresses in this SLC Read command. Each of the m page Address arrangements is like the prior-art single page address arrangement, placing a few bytes of column address first followed by a few bytes of row address, or vice versa. The major difference is that m page Addresses can be loaded in cascade between the start and end of the m-page SLC Read Command for the m-page Read operations according to embodiments of the present invention, rather than for a single page SLC Read as in prior-art NAND.

For example, m pages of addresses are loaded such that the first page's row Address is followed by the first page's column Address, then the second page's row Address is followed by the second page's column Address, and lastly the mth page's row Address is followed by the mth page's column Address and the end code. Note, since this preferred m-page SLC Read is still a Read operation, there is no need to load any page data from the external Flash Controller into the NAND flash as in a Program operation.
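The cascaded address loading in this example can be sketched as a builder for the byte sequence sent between the start and end of the command. This is purely illustrative: the start and end code names, the number of address bytes per page, and the function name are hypothetical placeholders, since no opcode values or address widths are specified here.

```python
def build_slc_read_sequence(pages, start_code="SLC_READ_START", end_code="SLC_READ_END"):
    """Build an m-page Read sequence.

    pages: list of (row_address_bytes, column_address_bytes) tuples, one per page.
    The row address of each page is placed before its column address, following
    the ordering used in the example above.
    """
    sequence = [start_code]
    for row_bytes, column_bytes in pages:
        sequence.extend(row_bytes)
        sequence.extend(column_bytes)
    sequence.append(end_code)
    return sequence


# Two hypothetical pages, each with a 2-byte row address and a 1-byte column address.
seq = build_slc_read_sequence([([0x01, 0x02], [0x00]), ([0x03, 0x04], [0x10])])
print(seq)
```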

Step 801: This step is to perform the preferred LBL precharging of m selected 8 KB pseudo CACHEcel registers, made of m 4 KB metal1 Even and 4 KB metal1 Odd C_(LBL) capacitors, with a preferred MHV (Medium-High-Voltage) Vinh (≧7V to 10V). Note, this SLC Block Read needs only one type of m pseudo CACHEcel registers of m metal1 Segment C_(LBL) capacitors in one or more HiNAND2 Groups.

The LBL precharging of the Vinh voltage is like the NAND's SLC Read operation that requires a BL precharging with ≈1V of Vdd-Vt on long BLs at the beginning of each single-page Read operation. But this preferred m-page SLC Read operation only needs to precharge m pages of 1/(L×J) shorter LBLs with less power consumption.

In an embodiment, following a 2-step sensing and amplification of the analog cell signal by a preferred Multiplier and Latch-type SA, a subsequent operation involves a step for charge-sharing between the C_(LBL) and C_(GBL) capacitors. A Vinh voltage higher than the conventional 1.0V would guarantee a reliable sensing of the NAND stored data and states. Note, the Vinh voltage of about 7V to 10V here is not used as a Program-Inhibit voltage as in the SLC and MLC Program BL voltage. The selected 8 KB NAND cells per one physical WL are within one selected Block of one selected Segment that comprises 8 KB pseudo CACHE C_(LBL) capacitors. Any pseudo CACHE register is termed a CACHEcel when the 8 KB selected cells of one selected full WL are within it.

Associated with Step 801, a preferred set of bias conditions in accordance with the HiNAND2 array circuit 200 shown in FIG. 1A is listed below:

a) TIE=DIVen=CSL=SEGo=SEGe=0V:
   TIE=0V is to shut off the NMOS transistor MLBLb so that only one 8 KB CACHEcel out of two jointed CACHEs is selected for LBL precharging to save power.
   SEGo=SEGe=0V are to prevent one paired 4 KB Odd and 4 KB Even C_(LBL) from leakage to the one shared corresponding 4 KB metal2 GBLs.
   CSL=0V is a regular setup for a normal NAND string Read operation.
b) PREo=PREe=H1:
   This is to turn on both MLBLso and MLBLse transistors so that the Segment power supply of Vinh can be coupled from the selected LBLps lines to the selected 8 KB CACHEcel's C_(LBL) capacitors.
c) LBLps=Vinh is supplied by a central Vinh MHV pump circuit (not shown).
   The precharge time of the m 8 KB CACHEcel's Odd and Even C_(LBL) is controlled by the on-chip State-machine design.

The m 8 KB CACHEcel precharge time is controlled by a self-timed LBLps Vinh Detector circuit. This is done by using one shared LBLps line as both a Vinh supply line and a Vinh sensing line. The Vinh supply comes from one end of the LBLps line, connected to the Vinh Driver, while the Vinh Detector operates at the other end of the LBLps line. Once the LBLps line reaches Vinh, it means the m 8 KB C_(LBL) capacitors in CACHEcel are fully charged to Vinh, so the Vinh Detector will issue a signal to the on-chip State-machine to stop the Vinh precharge operation. This Vinh precharge time can thus be controlled very accurately and automatically.
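The self-timed behavior of the far-end detector can be illustrated with a lumped RC charging model. This is a rough sketch under assumptions not stated in the text: the LBLps line and its attached C_(LBL) capacitors are modeled as a single-pole RC load, the detector trips at a fixed fraction of Vinh, and the RC constant, time step, trip fraction, and function name are all hypothetical.

```python
import math


def self_timed_precharge(vinh=7.0, rc=1.0e-6, dt=1.0e-8, trip_fraction=0.95):
    """Return the time at which a far-end detector would stop the precharge."""
    t = 0.0
    v_far_end = 0.0
    # The driver keeps charging until the detector at the far end sees the trip level.
    while v_far_end < trip_fraction * vinh:
        t += dt
        v_far_end = vinh * (1.0 - math.exp(-t / rc))
    return t


stop_time = self_timed_precharge()
print(stop_time)
```

Because the stop condition is tied to the measured far-end voltage rather than to a fixed timer, a heavier load (larger RC) automatically lengthens the precharge, which is the point of the self-timed scheme.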

Step 802: This step is to determine whether the 64XTs+1SSLp+1GSLp bus lines are free from being occupied by any concurrent NAND operation. If No, the flow will be idle and wait for their status to change. If Yes, the flow moves to the next Step 803.

Step 803: In addition to the m 8 KB CACHE C_(LBL) concurrent precharging, each selected Block's WL precharging of one selected WL, 63 unselected WLs, one SSL, and one GSL can be executed at the same time with Vtamin, Vread, Vdd, and Vss respectively. One self-timed Vread Detector (FIG. 9B) at the end of one dummy WL automatically and accurately controls each WL-precharge period. The reason for detecting only Vread for one NAND Block comprising one set of outputs of 64 WLs, one SSL, and one GSL is that Vread is the highest and slowest-charged WL voltage during the Read operation. Once the Vread voltage reaches the set level with some margin time in the dummy WL, it indicates that the whole selected set of 64 WLs, one SSL, and one GSL in each of the m selected Blocks has reached the desired Vread, Vtamin, and Vdd voltages respectively. The dummy-WL Detector will generate a signal to initiate a 2-step WL, SSL, and GSL voltage-latching process once Vread is detected. Right after that, the automatic control of a self-timed CACHEL C_(LBL) discharging step is started to reduce string stress.

The precharging of m sets of one selected WL, 63 unselected WLs, 1 SSL, and 1 GSL line in Block-mode SLC Read can be performed on a block-by-block basis at the same or different timelines to save WL precharge time, after receiving confirmation that the targeted XTs specified in Step 802 are not in use. The WL and C_(LBL) precharging operations can be carried out almost simultaneously in Steps 801 and 803. The desired WL-related bias voltages are listed below:

a) Selected-WL (m selected WLs in m selected Blocks): V_(s-WL)=Vtamin=0V.
b) The reason for using Vtamin as the SLC Read WL Verify-voltage is that a 2-Vt SLC cell has a wide ΔVt gap between Vtamax and Vtbmin. Actually, the SLC Read voltage on the m selected WLs can be any value between Vtamin and Vtamax. But the Vtamin value is preferred in consideration of a balanced ΔVt margin between the E-state and B′-state of a 2-Vt SLC cell.
c) Unselected-WL (m unselected 63-WLs in m selected Blocks): V_(uns-WL)=Vread=6V.
d) SSL (m unselected SSL lines in m selected Blocks): V_(SSL)=Vdd.
e) GSL (m unselected GSL lines in m selected Blocks): V_(GSL)=0V.
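The WL-related bias list above maps naturally onto a per-Block bias table. The following is a minimal sketch: the Vtamin=0V and Vread=6V values come from the list, the Vdd value is an assumption, and the function and key names are hypothetical.

```python
VTAMIN = 0.0  # selected-WL read voltage from the list above
VREAD = 6.0   # unselected-WL pass voltage from the list above
VDD = 1.8     # assumed supply level; not specified in the list
VSS = 0.0


def block_read_bias(selected_wl, num_wls=64):
    """Return the SLC Read bias for one selected Block (64 WLs + SSL + GSL)."""
    if not 0 <= selected_wl < num_wls:
        raise ValueError("selected_wl out of range")
    bias = {f"WL{i}": VREAD for i in range(num_wls)}  # 63 unselected WLs at Vread
    bias[f"WL{selected_wl}"] = VTAMIN                  # one selected WL at Vtamin
    bias["SSL"] = VDD
    bias["GSL"] = VSS
    return bias


bias = block_read_bias(selected_wl=5)
print(bias["WL5"], bias["WL0"], bias["SSL"], bias["GSL"])
```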

Note, Step 803 is carried out only after receiving confirmation in Step 802 that the targeted Read Block has not been selected for another m-page concurrent operation. In addition, each initiation of precharging the 64 WLs, 1 SSL, and 1 GSL of any one of the m selected Blocks is carried out by a pre-designed command, but the precharging time is accurately controlled by the dummy-WL Detector. Note, precharging of the 64WLs+1SSL+1GSL of m random Blocks is carried out on a block-by-block basis. If all m Blocks have the same location of the selected WL, then the precharging of the m-page WLs can be performed in 1-cycle. Otherwise, m random page Read operations need m precharging cycles to set m separate sets of 64WLs+1SSL+1GSL to the desired voltage levels. In conclusion, in Step 803, only one selected Block's 64WLs+1SSL+1GSL are precharged.
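The cycle-count rule stated above (one cycle when all m Blocks select the same WL location, otherwise m cycles) can be expressed directly. This is a small sketch of that rule only; the function name is hypothetical and the empty-input case returning 0 is an added assumption.

```python
def wl_precharge_cycles(selected_wl_per_block):
    """Number of WL precharging cycles for m selected Blocks.

    selected_wl_per_block: list with the selected WL index of each Block.
    """
    m = len(selected_wl_per_block)
    if m == 0:
        return 0  # assumption: nothing selected, nothing to precharge
    if len(set(selected_wl_per_block)) == 1:
        return 1  # same WL location in all m Blocks: one shared cycle
    return m      # random WL locations: one cycle per Block


print(wl_precharge_cycles([3, 3, 3]))
print(wl_precharge_cycles([3, 7, 3]))
```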

Step 804: This step discloses a self-timed WL voltage latching operation for one of m selected sets of 64 WLs, 1 SSL, and 1 GSL of m selected NAND Blocks during the predetermined interval of the cell-state evaluation stage in m-page SLC Read operation according to an embodiment of the present invention.

In the conventional NAND spec, there is no separate spec for WL-precharge time. The precharge time is included in the total SLC Read time spec of 25 μs, which covers several sub-steps: the initial 1.0V BL precharge, then the 6V Vread and VR WL precharge, and lastly the BL discharge during cell-state evaluation. In other words, there is no need for an accurate control method over the single-WL precharge time because there is only a single-WL SLC Read operation. A single-page Read operation is one simple task that can be easily controlled by an on-chip State-machine design.

But in the m-page SLC Read, m independent SLC page Read operations start and end at m different timelines, with the same time duration for each page. It is therefore preferred to have a more accurate, self-timed control for a more efficient Read cycle time for each independent WL or page. Once each WL starts the precharging operation at a different timeline, the dummy WL Vread Detector will automatically end the precharging of the selected WL in each of the m selected pages at a different timeline.

In order to achieve more accurate and secure WL precharge voltage and time control for each independent set of 64 WLs, SSL, and GSL, this invention uses three dummy WLs with exactly the same layout and length as a regular WL, but only the middle dummy WL is used for WL-precharge delay tracking. The two extra adjacent unused dummy WLs ensure that the same parasitic inter-WL capacitance is counted in the precharge-time calculation.

In an embodiment, the Vread Detector is made of one 2-input Differential Amplifier (DA), to be shown in FIG. 9B. One input of the DA is connected to the end of the middle dummy WL and the other input is connected to Vread, which is generated from a Vref generator. This novel Vref circuit can generate varied reference voltages such as Vpgm, Vpass, Vread, and VRn for the respective highest WL voltages in the respective Program, Program-Verify, and Read operations. The highest WL voltage takes the longest precharge time. Thus, once the highest WL voltage for a particular WL operation is precharged, all other selected 63 WLs, 1 SSL, and 1 GSL have been well precharged to the desired voltage level.

For this m-page SLC Read operation, the highest WL voltage is Vread, applied on the 63 unselected WLs of each of the m selected Blocks. Thereby, a Vread-ΔV voltage is switched on to connect to one end of the above said Vread WL Detector. Upon detecting that the precharged WL voltage is higher than the Vread-ΔV voltage, the DA's output will issue a signal to the correspondingly selected SLC page Address of one of the m selected Blocks to latch the well-precharged voltages of the 64 WLs, 1 SSL, and 1 GSL lines concurrently on their parasitic WL capacitors, with an extra 100 ns-500 ns margin delay for the final precharged WL to reach Vread when ΔV is set to 0.5V. At the latching moment, a self-timed LBL-discharge operation is immediately initiated in accordance with the LBL-discharge Detector circuit.
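As a rough numeric illustration of the self-timed latching described above, the following sketch estimates the latch moment. It assumes the far end of the middle dummy WL charges like a first-order RC line, which is a modeling assumption and not part of the specification. The function name and the time constant `tau_ns` are hypothetical, while Vread=6V, ΔV=0.5V, and the 100 ns-500 ns margin come from the text.

```python
import math


def wl_latch_time_ns(vread=6.0, delta_v=0.5, tau_ns=200.0, margin_ns=100.0):
    """Estimate when the WL voltages of one Block are latched.

    Assumes v(t) = vread * (1 - exp(-t / tau_ns)) at the dummy-WL far end.
    tau_ns is a made-up value; the source gives no RC figure.
    """
    if not 0.0 < delta_v < vread:
        raise ValueError("delta_v must lie between 0 and vread")
    # The Differential Amplifier trips when v(t) crosses vread - delta_v.
    trip_ns = tau_ns * math.log(vread / delta_v)
    # The margin delay lets the WL settle closer to vread before latching.
    return trip_ns + margin_ns
```

Under this model, a smaller ΔV moves the trip point later, so less margin delay would be needed for the WL to approach Vread.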

Step 805: This step is another self-timed operation of this preferred m-page SLC Read to perform Vinh discharging or retaining operations in accordance with one of m selected 8 KB CACHEcel C_(LBL) cell patterns as well as Step 801 for all BL precharging. For saving Read time and WL Vread stress, one 4 KB Even and 4 KB Odd C_(LBL) capacitors and cells are selected in each 8 KB CACHEcel register. It is like a single-page All-BL Read operation. But in certain embodiments of the present invention, m All-BL Read operations will eventually be performed in a pipeline manner. One WL per Block is sequentially selected for SLC Read at a different timeline, thus the discharging and retaining of Vinh are performed one WL at a time here.

In order to control the C_(LBL) discharge time automatically and accurately, one VLBL Detector per 8 KB of CACHEcel capacitors uses one metal0 LBLps power line. It is like a conventional CAM's sense line but without extra array layout overhead or precharge power consumption, because each LBLps line has to be coupled to a Vinh Driver initially when it is selected for C_(LBL) precharging in Step 801. In this VLBL Detector, the precharged Vinh voltage is kept after precharging for the subsequent VLBL Detector operation, without an extra precharge step. Thus, the power consumption of VLBL detection can be saved.

The actual C_(LBL) discharging operation of each selected Block starts whenever the VR1, Vread, Vdd, and H1 precharge voltages are respectively higher than Vtemax and Vtamax of the selected NAND cells and the Vts of the string-select NMOS transistors MS and MG in one selected NAND string. Once V_(LBL) is detected at the desired level, such as 2.0V, the VLBL Detector responds by shutting off the Detector function after some predetermined Δt delay, to prevent the LBL leakage of Vinh-retained C_(LBL) capacitors from C_(LBL)-discharged capacitors in the same CACHEcel register as well as to ensure the discharged C_(LBL) voltage nears 0V.

The final value of each C_(LBL) bit voltage is determined by the state of the corresponding NAND cell. If a NAND cell is in E-state, the corresponding C_(LBL) Vinh voltage will be discharged to Vss; otherwise, the C_(LBL) voltage retains the initial Vinh voltage if the NAND cell is in B′-state with Vtb′min>Vtamin. The desired bias WL voltages are listed below.

-   a) S-WL (m selected WLs in m selected Blocks): V_(s-WL)=Vtamin
-   b) Uns-WL (m unselected 63 WLs in m selected Blocks): V_(uns-WL)=Vread=6V
-   c) SSL (m unselected SSL line in m selected Blocks): V_(SSL)=Vdd
-   d) GSL (m unselected GSL lines in m selected Blocks): V_(GSL)=H1 or Vdd.

A preferred set of bias conditions in J Groups is shown below: a) TIE=DIVen=CSL=SEGe=SEGo=0V; b) PREe=PREo=0V and LBLps=0V. Note, V_(GSL)=Vread is to turn on the NAND string to allow NAND cell current to flow if E-state cells are detected.

Step 805 takes a shorter time to discharge m 8 KB C_(LBL) capacitors from the Vinh voltage to Vss=0 than the conventional NAND BL discharge step, due to the same large resistance of each NAND cell string plus a long but lighter C_(LBL) capacitance. In certain embodiments of the present invention, although the C_(LBL) capacitor is precharged to a Vinh voltage higher than the 1.0V used in prior art, each C_(LBL) discharge is still much faster than that of a conventional C_(BL) because C_(LBL)=1/(L×J) C_(BL), where there are L Segments per Group and a total of J Groups per plane. C_(LBL) is one local metal1 LBL capacitor, while C_(BL) is one global metal1 conventional BL capacitor.
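The capacitance relation above can be checked with a small calculation. The sketch below assumes a simple RC discharge through the NAND string, which is a simplification not stated in the specification. The function names, the values of L and J, and the sensing level are hypothetical, while the 1.0V prior-art precharge comes from the text.

```python
import math


def local_bl_capacitance(c_bl, segments_per_group, groups_per_plane):
    """C_LBL = C_BL / (L x J), with L Segments per Group and J Groups."""
    if segments_per_group < 1 or groups_per_plane < 1:
        raise ValueError("L and J must be positive")
    return c_bl / (segments_per_group * groups_per_plane)


def rc_discharge_time(r_string, capacitance, v_start, v_sense):
    """Time for an RC node to fall from v_start to v_sense (assumed model)."""
    if not 0.0 < v_sense < v_start:
        raise ValueError("need 0 < v_sense < v_start")
    return r_string * capacitance * math.log(v_start / v_sense)
```

With hypothetical L=8 and J=8, C_(LBL) is 1/64 of C_(BL). In this model the higher starting voltage only adds a logarithmic factor, so the LBL discharge would still finish well before a conventional BL discharge from 1.0V.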

Note, although the first selected SLC Read operation is performed on a ½-page 4 KB Even page, the discharge for B′-state evaluation is done on one full physical WL that contains both the 4 KB Even page and the 4 KB Odd page concurrently because of the WL sharing. In this manner, a 2× faster SLC Read operation can be achieved because one of the key SLC Read bottlenecks is the LBL discharge time.

Step 806: This step is to perform a self-timed concurrent discharge of 64 WLs once the 8 KB C_(LBL) discharge operation per selected CACHEcel is complete in Step 805. The purpose of this step is to reduce the Vread disturbance on the gates of the m sets of 63 WLs once m 8 KB SLC Read data are well developed and available at m 8 KB CACHE C_(LBL) capacitors for sensing and amplification by each Multiplier and each SA. The preferred sets of bias conditions are listed below:

-   a) ENB=1, CLA=CLR=ENS=0,
-   b) XT1˜XT64=SSLp=GSLp=0V
-   c) CLWL=one-shot pulse of Vdd.

The reason to set bias condition b) is that the discharges of WLs, SSL, and GSL have to go through the XT1 to XT64, SSLp, and GSLp common vertical bus lines in accordance with the Block-decoder circuit shown in FIG. 2A, with the HXD node set to Vdd by the status of the Block-decoder latch made of INV3 and INV4 in conjunction with the selected address stored in an Address Buffer and a self-timed control circuit (not shown).

In an embodiment, right after the C_(LBL) discharge step, the HV Vread voltage is immediately discharged for the non-selected 63 WLs in the selected Blocks to reduce Vinh WL stress in each SLC Read operation, because there is no further Read for that selected SLC page. But in the subsequent MLC Read operation, right after the first MSB bit Read, no WL discharge is performed, to save WL charging power, because two more subsequent Read operations at VR2 and VR3 are required to complete the 3-Read MLC Read operation.

Step 807: Now, one of m 4 KB Even SLC page analog data developed in accordance with one of 4 KB Even NAND cells within one 4 KB Even CACHEcel C_(LBL) capacitor would be sequentially sensed and amplified by both the 4 KB corresponding Multipliers and SAs to perform 2-step analog amplifications, and the final 4 KB Even SLC digital data would be stored in 4 KB SAs on a 4 KB ½-page by ½-page basis due to the limitation of 4 KB metal2 GBL bus lines. Each readout SLC data bit of Qi=0/1 and QiB=1/0 at each SA is used to set each corresponding P/RB bit with same polarity if each corresponding stored charge voltage in CACHEcel=0V/Vinh in accordance with the DB circuit shown in FIG. 3 and the following set of bias conditions:

-   a) DIVen=SEGe (in CACHEcel)=H1,
    -   This is to connect the broken-GBL transistors MGBL to provide a way to connect each sensed but diluted C_(LBL) voltage to each corresponding Multiplier via charge sharing with each corresponding GBL.
-   b) TIE=CSL=SEGo=SEGe (in CACHEint)=0
    -   This is to shut off the leakage path through MLBLb, with its gate tied to TIE, between each paired CACHEcel and CACHEint C_(LBL) capacitors so that the sensed analog cell signal at CACHEcel would not be diluted between paired CACHE C_(LBL) capacitors.
-   c) Voutp (high/low)=Vref+/−ΔV,
-   d) T5=one reverse pulse of Vdd,
    -   The T5 clock is used to do a second analog amplification after a first analog amplification done by the Multiplier and finally to latch the fully amplified digital cell data at the SA.

Step 808: This step is to transfer the 4 KB Even SLC page data stored in the SAs to the 4 KB P/RB with a reversed polarity, because each SLC E-state cell's analog LBL voltage is 0V but its digital logic data is “1” by definition. Conversely, each SLC B′-state cell's analog voltage is Vdd but its logic data is “0.” Thus, the readout bit data from SA to P/RB has to be flipped before it is sent out to a Flash Controller via 8 I/Os.

Di=0/1 in each P/RB if the corresponding Qi=1/0 in each SA. The preferred sets of bias conditions are listed below:

-   a) IDAB=ENMLSB=0V
    -   These conditions apply because this is SLC Read mode, not MLC mode.
-   b) PGM=IDB=0V
    -   PGM=0V is because this is a Read mode, not Program-mode.
-   c) IDC=WBK=one-shot pulse of Vdd with T5=1=Vdd.

The above conditions are to turn on the paired NMOS transistor 19 gated by IDC and NMOS transistor 16 gated by WBK along with the NMOS transistor 18 gated by Qi and NMOS transistor 17 gated by QiB so that node DiB=0 and Di=1.
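The polarity chain of Steps 805 to 808 (cell state, then C_(LBL) voltage, then Qi at the SA, then Di at the P/RB) can be traced with the following sketch. It only restates the logic relations given in the text; the function names are hypothetical and the transistor-level circuit is not modeled.

```python
def c_lbl_retains_vinh(cell_state):
    """True if C_LBL keeps Vinh after the discharge step (Step 805)."""
    if cell_state == "E":
        return False  # erased cell conducts, C_LBL discharges to Vss
    if cell_state == "B'":
        return True   # programmed cell blocks current, C_LBL keeps Vinh
    raise ValueError("SLC cell state must be E or B'")


def sense_amp_bit(retains_vinh):
    """Qi at the SA: 0 for a 0V C_LBL, 1 for a Vinh C_LBL (Step 807)."""
    return 1 if retains_vinh else 0


def prb_bit(qi):
    """Di at the P/RB is the inverse of Qi (Step 808)."""
    return 1 - qi


def slc_logic_bit(cell_state):
    """Logic data finally presented for one SLC cell."""
    return prb_bit(sense_amp_bit(c_lbl_retains_vinh(cell_state)))
```

An E-state cell thus reads out as logic 1 and a B′-state cell as logic 0, matching the definition given in Step 808.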

Step 809: This step is to check whether the 4 KB real CACHE registers are unoccupied and available for receiving one 4 KB SLC Even page data from the 4 KB P/RB. If the check result is Yes, then the transfer of the 4 KB Even SLC page data from P/RB to CACHE would be executed immediately in 1 cycle, and the RDY pin would, at Step 810, be set to a Low value temporarily to inform the off-chip Flash Controller. If the check result is No, then the flow loops to wait for the release of the 4 KB real CACHE engaged in other operations.

Step 811: Before transferring the 4 KB Even SLC bit data from the 4 KB P/RB to the 4 KB real CACHE, each CACHE latch bit has to be equalized first to ½Vdd on each pair of nodes DIOi and DIOiB by bias condition b) below. In other words, DIOi=DIOiB=½ Vdd with the following bias conditions:

-   a) RD=LD=0V,
-   b) LAT=EQ=1=Vdd,
-   c) All gates of Y-pass=0V to prevent the leakage to I/O pad.

Step 812: After equalizing, the transfer can be executed on a 4 KB basis in 1 cycle. If DIOi=0/1, then Di=1/0 with a reverse polarity as explained before, with the following bias conditions.

-   a) WBK=IDC=IDAB=ENMLSB=0V,
-   b) EQ=LD=PGM=IDB=0V,
-   c) All Y-pass gate=0V,
-   d) LAT=1 and RD=H1.

Step 813: This step is to latch the 4 KB Even SLC page data in the 4 KB P/RB into the 4 KB CACHE register (CR) with the following bias conditions:

-   a) WBK=IDC=IDAB=ENMLSB=0V,
-   b) EQ=LD=PGM=IDB=0V,
-   c) All Y-pass gate=0V,
-   d) LAT=Falling edge and RD=H1 falling edge to latch the data in 4 KB CACHE.

Step 815: Upon completion of the transfer of each 4 KB SLC Even page data from the 4 KB P/RB to the 4 KB CACHE, the 4 KB CACHE data is ready to be sequentially sent out to the Flash controller via the NAND's 8 I/Os. At this stage, the RDY pin is reset to high (=1) to report that the 4 KB Even SLC page data transfer is done.

Step 816: This step is to clock out each 4 KB SLC Even page data stored in the 4 KB CACHE registers to the Flash controller via 8 I/Os in units of bytes. Thus, 4 KB SLC Even page data would take 4K cycles.

Steps 817 and 818: These two steps repeat the sensing, amplification, and transferring steps 807 to 815 for the remaining 4 KB SLC Odd-page Read, as for the 4 KB Even SLC page data above. As seen in Step 805, the 4 KB Odd SLC page data and 4 KB Even SLC page data become available at the same time, when the same SLC-WL is coupled to the desired voltage level as explained above. Therefore, the 4 KB Odd SLC data stored in the 4 KB Odd CACHEcel C_(LBL) capacitors can be sensed, amplified, and output through the I/Os sequentially from Step 805. Note, the arrows of the flow show that Steps 816 and 817 can be executed at the same time and Steps 818 and 819 can also be done concurrently.

At Step 819, if all m 8 KB pages of 4 KB Even and 4 KB Odd SLC data have been sensed and sent out successfully, then the m-page concurrent SLC Read operation is completed at Step 821. Otherwise, the SLC Read operation continues to the next one of the m selected pages at Step 820 and jumps to Step 801, which is the initial step of this m-page SLC Read flow.

Subsequently, MLC MSB page data and LSB page data Read flows are provided according to certain embodiments of the present invention. Unlike the m-page SLC Read flow, where each physical page is read as only one logic page of SLC data from each physical SLC-WL, each physical page of MLC Read is read as two logic pages of MLC data, the MSB page and the LSB page, from each physical MLC-WL. For m-page 8 KB MLC Read operation, each page is divided into 8 KB MSB Block Read and 8 KB LSB Block Read. Each full 8 KB MLC MSB logic page is then further divided into two 4 KB sub-logic pages: one 4 KB MLC Even MSB logic page and one 4 KB MLC Odd MSB logic page. In other words, each MLC physical NAND cell stores two logic bits, the MSB bit and the LSB bit.

During this preferred m-page MLC MSB-bit Read operation, there are two cases that make each MSB Read bit data different. The first case is a) Flag cell=1. In this case, each accessed MLC cell only stores 2-Vt MSB bits; the LSB bit data is not programmed yet. Therefore, in this case, an m-page 2-Vt MLC MSB Read is identical to an m-page 2-Vt SLC Read as explained in FIG. 8C.

For example, each 2-Vt SLC Read is to distinguish the programmed B-state cell from the erased E-state cell in one 2-Vt physical SLC NAND cell. Similarly, each MLC MSB Read with Flag cell=1 is also to distinguish one programmed B′-state cell from one erased E-state cell in one 2-Vt physical pseudo-SLC NAND cell. In other words, each MLC physical cell is supposed to store 4 distinct states, one negative-Vt erased E-state and three positive-Vt programmed states (A-state, B-state, and C-state), but at the time of MLC MSB Read only 2 states are stored within each MLC physical cell. Thus, this 2-Vt MLC cell is termed a pseudo-SLC cell. Each MLC MSB bit data divides each MLC cell into 2 Vt states, E-state=“1” and B′-state=“0,” just as each SLC bit data divides each SLC cell into 2 Vt states, E-state=“1” and B-state=“0.” As a result, when Flag cell=1, MLC MSB Read=SLC Read. The only difference is the Verify WL voltage.

Thereby, for the m-page MLC MSB Read with Flag cell=1, the flow is in principle the same as the SLC Read flow shown in FIG. 8C except for the WL-Verify voltage. In SLC Block Read, Vtamin on the selected WL is used to differentiate B-state from E-state, while in MLC MSB (pseudo SLC) Read, a preferred VR1 on the selected WL is used instead to differentiate B′-state from E-state. The value of VR1 is less than Vtamin, as specified in the 4-Vt distribution chart shown in FIG. 4B, for better noise margin, because the ΔVt gap between E-state and B′-state in the MLC MSB Vt distribution scheme is larger than in the SLC Vt distribution scheme.

The second case is b) Flag cell=0. In this case, the accessed MLC cells in the selected MLC WLs have already stored both the MSB bit and the LSB bit. Under this scenario, each MLC MSB bit Read with Flag=0 will be different from each MLC MSB bit Read with Flag=1. The former, with Flag=0, reads one MLC MSB bit out of a real 4-Vt MLC cell that stores 2 bits. The latter, with Flag=1, still reads out one MLC MSB bit and one MLC LSB bit from a pseudo 2-Vt MLC physical cell. The 8 KB MLC MSB bit data is either “0” or “1” but the second 8 KB MLC LSB bit data is all fixed to “1.”

In an embodiment of the present invention, the 8 KB MLC MSB Read operation is divided into a Read of the 4 KB Even MLC MSB logic page and a Read of the 4 KB Odd MLC MSB logic page. Each 4 KB MLC MSB Read takes 4K cycles to be sequentially clocked out to the Flash Controller via the real CACHE and then the 8 I/Os, regardless of Odd or Even MLC MSB logic page.

Conversely, the cycle count of each 4 KB MLC LSB Read is subject to the Flag cell data. Whenever Flag=1, the 4 KB MLC LSB Read takes 1 clock cycle, not 4K cycles, to save time. This is done by issuing a one-shot RDY signal from Vdd to Vss to inform the off-chip Flash controller. Once the Flash controller senses the RDY one-shot signal, it will decode which NAND chip issued the RDY signal. Each RDY signal is logically assigned a different address in some Flash control system designs. It is then proposed that the Flash Controller find the NAND flash address and then read the instruction from a Flash Status Register. If the Status Register indicates that the currently addressed MLC has no LSB data, then the Flash Controller will be aware that there is no need for a further 4K-1 cycles to read out the remaining all-“1” LSB data, saving power and freeing the I/O interface bus lines. One cycle of all-“1” MLC LSB data is good enough. The following flows of MLC MSB and LSB Read follow the above predetermined schemes.
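The Flag-dependent LSB readout described above can be summarized in a short sketch. It assumes one byte is clocked out per cycle on the 8 I/Os, as stated for the 4K-cycle case, and models the single all-“1” cycle as one 0xFF byte. The function names are hypothetical, and the RDY and Status Register handshake is not modeled.

```python
PAGE_BYTES = 4 * 1024  # one 4 KB half page, one byte per cycle on 8 I/Os


def lsb_readout(flag, lsb_page=None):
    """Return the bytes clocked out for one 4 KB MLC LSB half page."""
    if flag == 1:
        # LSB not programmed yet: a single cycle of all-"1" data suffices.
        return bytes([0xFF])
    if lsb_page is None or len(lsb_page) != PAGE_BYTES:
        raise ValueError("Flag=0 requires a full 4 KB LSB page")
    return bytes(lsb_page)


def cycles_saved(flag):
    """Clock cycles saved relative to a full 4K-cycle readout."""
    return PAGE_BYTES - len(lsb_readout(flag, bytes(PAGE_BYTES)))
```

With Flag=1 the sketch saves 4K-1 cycles per half page, which is the figure quoted in the text.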

Note, the reason to use a lower value of VR1 in place of Vtamin for MLC MSB Read is to make the MLC Read faster, since it may need Read cycles at 3 different WL voltages, VR1, VR2, and VR3, to distinguish a 4-Vt MLC cell if the subsequent check finds Flag=0. The Read is faster, with fewer BL precharge steps, if the selected WL voltage is gradually increased from the lowest VR1 to VR2 and then VR3. Therefore, only one BL precharge step is required for 3 BL discharges at the 3 Read WL voltages of VR1, VR2, and VR3.
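The benefit of stepping the selected WL voltage upward after a single precharge can be illustrated as follows. The sketch assumes a simplified cell model in which a cell conducts, and so discharges its C_(LBL), as soon as the WL voltage exceeds its Vt. The function names and any numeric read levels used with them are hypothetical; the state names E, A, B, and C follow the 4-state description in the text.

```python
def first_discharging_read(cell_vt, read_levels):
    """Return the index of the first read at which C_LBL discharges.

    read_levels must be ascending. Returns None if the cell never
    conducts, so C_LBL still holds its single precharge at the end.
    """
    for index, wl_voltage in enumerate(read_levels):
        if cell_vt < wl_voltage:  # cell turns on and pulls C_LBL to Vss
            return index
    return None


def classify_mlc_state(cell_vt, vr1, vr2, vr3):
    """Map a cell Vt to E, A, B, or C using reads at VR1 < VR2 < VR3."""
    if not vr1 < vr2 < vr3:
        raise ValueError("read voltages must satisfy VR1 < VR2 < VR3")
    index = first_discharging_read(cell_vt, (vr1, vr2, vr3))
    return "C" if index is None else ("E", "A", "B")[index]
```

Because a discharged C_(LBL) stays discharged, the first read that discharges it identifies the state, and no re-precharge is needed between the three reads as long as the WL voltage only increases.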

As with the SLC-WL and MLC-WL defined previously, whenever an MLC cell is read, the Flash Controller does not know whether the addressed MLC WL contains only the MSB bit or both MSB and LSB bits. It relies on checking the Flag cell physically located in the same MLC WL as the 8 KB MSB bits. Therefore, it is better to start from the VR1 Read to read an MLC cell as a first trial. At the same time, each Flag cell data is also read out from a separate Flag cell's SA, P/RB, and real CACHE register in conjunction with the 8 KB MSB bits. If Flag=1, the addressed MLC WL has no LSB data but only MSB data programmed. In that case, the Read flow of MLC MSB (2-Vt only)=SLC (2-Vt) with Flag cell=1. Therefore this m-page MLC MSB Read follows the steps in accordance with the m-page SLC/MLC MSB Read Methodology and flows with a condition of Flag cell=1 as shown in FIG. 8A.

On the contrary, if Flag cell=0, the addressed MLC cell has already stored both MSB and LSB bits, e.g., 4 states. As a result, the Read MSB bit data has to be readjusted to get the accurate MSB data in accordance with the flow shown in FIG. 8B for the m-page MLC MSB/LSB Page Read Methodology with a condition of Flag cell=0.

FIG. 8E is a flow chart showing a method for performing m-page MLC Read operation according to an embodiment of the present invention. The flow starts from receiving an MLC m-page Command and m selected page Addresses and ends with outputting m 4 KB MLC Even MSB and m 4 KB MLC Odd MSB page data to the 8 I/Os on a one-4 KB by one-4 KB basis per 1-cycle if Flag cell=1 in each selected MLC-WL, which indicates that each MLC physical cell stores only the MSB bit.

When Flag cell=0 in each selected MLC-WL, this flow shows the reading and outputting of both the 8 KB logic MSB page data and the 8 KB logic LSB page data that are stored in each of the m selected MLC physical WLs or pages via reading, sensing, charge-sharing, sample-and-hold, and amplification in 8 KB C_(LBL) lines and 4 KB C_(GBL) lines.

In a specific embodiment, this is a first sub-flow of a preferred m-page MLC MSB Read flow that is based on the Flag=1 status as explained above. Thereby, no LSB bit but only the MSB bit is stored in the addressed MLC-WLs. Therefore, this sub-flow is basically identical to the m-page SLC Read operation for this m-page MLC MSB (pseudo-SLC) 2-Vt Read operation.

There are two differences in flow design between the previously shown 2-Vt SLC Read and the current 2-Vt MLC MSB Read. Firstly, the MSB Read involves up to two reads, a first VR1 Read and a second VR2 Read, if Flag=1, which indicates that no LSB is programmed in the addressed MLC-WL. But the prior SLC Read needs only one Vtamin read. Thus, the CACHEcel register alone is sufficient to store the page data. Secondly, both VR1 and VR2 data have to be kept in two 8 KB CACHE C_(LBL) capacitors for the subsequent MLC LSB bit-flipping operation throughout a whole MLC Read cycle. In the prior 2-Vt SLC Read, only one 8 KB CACHEcel register is sufficient because no bit-flipping steps are required to get accurate SLC page data. In contrast, in this MLC Read cycle, three reads at VR1, VR2, and VR3 are required. Thus 3× the temporary data have to be kept for bit-flipping operations, and more pseudo 8 KB CACHE registers are required. For this 4-Vt MLC Block Read, two CACHE registers, CACHEcel and CACHEint, are required. The MLC MSB 2-Vt Read uses VR1 but the SLC 2-Vt Read uses Vtamin to distinguish E-state from the others.

Referring to FIG. 8E, as in the SLC Read, this MLC MSB Read flow starts from Step 822 to sequentially receive, load, and decode an MLC Read Command and its associated m pages of Read Addresses in units of bytes via the NAND's 8 I/Os from an off-chip Flash Controller into the NAND's designated Command and Address Buffers (not shown). In addition, m latches of m selected Even and m selected Odd Segments and m Block-decoders are also set according to the m-page Addresses stored in m Address Buffers for the concurrent m-page MLC MSB-only Read operation.

For example, the m-page MLC Read Command is loaded into the designated Command register so that this new MLC Command can be decoded and the associated MLC Read flows can be initiated immediately. Note, as defined previously, once the MSB Block Read Command is decoded, it always starts first from the regular MLC MSB bit Read along with a Flag bit, followed by the second LSB Read. If Flag=1, the m-page MLC Read will only do the 8 KB MSB Read and no 8 KB LSB Read in the array, but the 8 I/Os would be fixed with one byte of “FF” data to end one page of MLC Read operation. If Flag=0, then the m-page MLC Read will do both the 8 KB MSB Read and the 8 KB LSB Read.

Similarly, m-page Addresses are loaded into m designated on-chip Address Buffers in conjunction with other control circuits (not shown) to set the corresponding m Even and Odd Segment latches as shown in FIG. 2B and m Block latches as shown in FIG. 2A of the preferred HiNAND2 array (FIG. 1A).

Moreover, the m addressed 8 KB MLC MSB page data are also divided into m 4 KB Even-page MLC MSB data and m 4 KB Odd-page MLC MSB data. These m pages of MLC data are selected concurrently by m Segment latches with m Block latches. Conventionally, only one MSB page Address in one selected NAND plane is allowed at a time, thus only one MLC page is specified in the common NAND Read command, with no difference between SLC and MLC commands. But in embodiments of the present invention, the HiNAND2 array allows m pages of MLC MSB addresses in every selected NAND plane. Thus up to m pages of addresses can be flexibly specified in this MLC MSB Read Command. Each of the m MSB page Address arrangements is like a single page address arrangement: it begins with a unique start-code, followed by a few bytes of column address and then a few bytes of row address, or vice versa. In an embodiment of the present invention, m MSB pages of Addresses can be loaded between the start and end of the m-page MLC MSB Read Command, because this is an m-page Read in HiNAND2 rather than a single page MLC MSB Read in conventional NAND.

Step 823: Similarly, this step is to perform a plurality of C_(LBL) precharging operations. But unlike the SLC Block Read Step 801, which only precharges m 8 KB CACHEcel C_(LBL) capacitors for one Read, this m-page MLC MSB Read needs to precharge m 8 KB pseudo CACHEcel C_(LBL) capacitors and m 8 KB CACHEint C_(LBL) capacitors with the Vinh voltage (7V up to 10V) to separately store two different MSB page data in different temporary registers, because each MLC MSB Read requires 2 Reads for more efficient operation. The reason for precharging to the Vinh voltage was explained before and is omitted here. Note, each 8 KB C_(LBL) precharging means that both the 4 KB Even and 4 KB Odd C_(LBL) pseudo capacitors in one physical WL are concurrently precharged.

Due to the additional CACHEcel and CACHEint selections for m-page MLC MSB Read, the bias conditions are set differently from the m-page SLC Read operation. Associated with Step 823 is a set of bias conditions in accordance with the HiNAND2 array circuit shown in FIG. 1A:

-   a) TIE=H1, DIVen=0V, CSL=0V, SEGo=SEGe=0V (two sets for two CACHEs)
    -   TIE=H1≧Vinh+Vt is to turn on the MLBLb NMOS transistor between CACHEcel and CACHEint so that CACHEcel and CACHEint can be jointly precharged with Vinh via two different PREo and PREe transistors.
    -   SEGo=SEGe=0V of two Segments is to prevent the two sets of one paired 4 KB Odd and 4 KB Even C_(LBL) in CACHEcel and CACHEint from leaking to the commonly shared corresponding 4 KB metal2 GBLs.
    -   CSL=0V is a regular setup for a normal NAND string Read operation.
-   b) PREo=PREe=H1 (two sets):
    -   Two sets of PREo and PREe are to turn on both sets of MLBLso and MLBLse transistors so that Vinh can be coupled from one common selected LBLps line to the two selected 8 KB C_(LBL) capacitors of CACHEcel and CACHEint.
-   c) LBLps=Vinh is supplied by a central Vinh MHV pump circuit.

The m 8 KB CACHEcel and m 8 KB CACHEint C_(LBL) precharge time is controlled by a self-timed LBLps Vinh Detector circuit. This is done by using one shared LBLps line as both a Vinh supply line and a Vinh sensing line. The Vinh supply comes from one end of the LBLps line, connected to a Vinh Driver, while the Vinh Detector operates at the other end of the LBLps line. Once the LBLps line reaches Vinh, the 2m 8 KB C_(LBL) capacitors in both CACHEcel and CACHEint are fully charged to Vinh, so the Vinh Detector will issue a signal to the on-chip State-machine to stop the Vinh precharge operation. This Vinh precharge time thus can be very accurately and automatically controlled even for two-CACHE precharging.

Step 824: Similarly, this decision step is to check whether the XT bus line is free for the remaining unfinished MLC MSB Read Segments. If these Segments are occupied by other concurrent operations, then this MSB Read will loop to wait. If not, then the unfinished Segment's pages are selected for the next WL precharge step.

Step 825: Note, although two pseudo CACHEs, CACHEcel and CACHEint, are selected for precharging and latching, only one set of 64WLs+1SSL+1GSL lines in CACHEcel is selected for concurrent precharging and latching with VR1 (1 selected WL), Vread (63 unselected WLs), Vdd (1 SSL) and Vss (1 GSL). The 64WLs+1SSL+1GSL lines in CACHEint are disabled by coupling to Vss.

The self-timed Vread Detector as previously explained for the SLC Read is also used to automatically and accurately control each WL-precharging and latching period, for the same reason explained before. Similarly, the dummy-WL Vread Detector will generate a signal to initiate a 2-step WLs, SSL, and GSL voltage latching process once Vread is detected. The desired WL-related bias voltages are identical to those of the SLC Read and are omitted here for simplicity.

After latching VR1 and Vread, the common bus lines of 64 XTs, 1 SSLp, and 1 GSLp are released for other possible concurrent m-page operations before the next VR2 read. During the interval between the VR1 and VR2 reads, there is about 5-10 μs available to do other newly added concurrent reads or programs, which need to separately and independently precharge newly addressed WLs and BLs at different Segment Addresses. Note, the availability of the 64 XT bus lines can be provided by the XT Vpass or Vread Detector. It just needs two XT Detectors, like the Vread Detector made of a 2-input DA.

Step 826: This step again is a self-timed C_(LBL) Vinh discharging and retaining operation in accordance with one of m selected 8 KB MLC MSB bit patterns within the CACHEcel register, as explained in the previous Step 805 of the m-page SLC Read operation.

For saving MSB Read time and WL stress at Vread, one 4 KB Even and 4 KB Odd cells within the 8 KB CACHEcel are selected, in accordance with only one set of 64WLs+1SSL+1GSL lines being selected for precharging in CACHEcel. This is intended to perform a single-page All-BL Read similar to that in conventional NAND. But due to the preparation of the two VR1 and VR2 reads for this MSB Read, both CACHEcel and CACHEint are either discharging or retaining Vinh at the same time, by using TIE=H1 to connect them. In other words, the discharging operation only happens on the CACHEcel capacitors, but due to the full connection between CACHEcel and CACHEint via the selected MLBLb transistor in its conduction state, the final voltage of each bit of the CACHEint C_(LBL) capacitors is equal to each corresponding bit of the CACHEcel C_(LBL) voltages.

Eventually this m-page MLC MSB Read will perform m All-BL Reads in a pipeline manner. One MLC-WL at a time is sequentially selected for m-page MLC MSB Read at a different timeline, thus the discharging and retaining of Vinh is done one MLC-WL at a time here. In order to control the C_(LBL) discharge time of the m-page MSB Read for both CACHEcel and CACHEint automatically and accurately, only one VLBL Detector is required, with similar bias conditions and final desired WL voltages: 64 WLs=SSL=GSL=0V after discharge. The preferred set of bias conditions is:

-   a) TIE=H1. This condition is to equalize the final C_(LBL) voltages in both CACHEcel and CACHEint.
-   b) DIVen=CSL=SEGe=SEGo=0V to prevent leakage to the common GBLs.
-   c) PREe=PREo=0V and LBLps=0V. Since this is a discharging step, the precharge transistors and the Vinh power line of MLBLe and MLBLo have to be disconnected.

Again, Step 826 takes a shorter time to discharge m 8 KB C_(LBL) capacitors from Vinh to Vss than the prior art, because the Segment C_(LBL) lines of both CACHEcel and CACHEint are identical and short, with a lighter capacitance C_(LBL)=2/(L×J) C_(BL) than the conventional long BL. Note, the factor 2 arises because two C_(LBL) capacitances need to be discharged, which is 2-fold that of the SLC Read discharge, where only one C_(LBL) capacitance needs to be discharged.

Note that although the first selected MLC MSB Read is a ½-page 4 KB Even MSB page, the LBL discharge is done concurrently for one full physical MLC-WL, which contains both the 4 KB Even and 4 KB Odd pages, because both pages share the same MLC-WL. In this manner, a 2× faster MSB Read can be achieved, because one of the key MSB Read bottlenecks is the LBL discharge time.

In conclusion, after this step, both CACHEcel and CACHEint store the same 4 KB Even and 4 KB Odd Vinh/Vss analog pattern in accordance with the 8 KB of MSB data stored in each physical MLC-WL. After the first VR1 Read, the voltages of all selected 64WLs+1SSL+1GSL lines are kept without being discharged, so that the next VR2 Read is quicker and consumes less power, with no need to precharge the same 64 WLs, 1 SSL, and 1 GSL again, as explained in Step 827 below.

Step 827: This step performs an LSB VR2 Read (equivalent to MSB Read) following the LSB VR1 Read for the same m selected MLC-WLs. Because the selected 64 WLs, SSL, and GSL lines were not discharged after the VR1 Read, only one selected WL per 64-WL Block needs to be precharged. The VR1 voltage held on each selected WL from the VR1 Read is replaced by the VR2 voltage in this LSB VR2 Read operation. This step has to be done one 64-WL Block at a time, so an m-page MLC LSB VR2 Read needs m VR2 precharges in total. The VR2 precharge step comprises four sub-steps, as listed in the following Table 7.

TABLE 7. VR2 precharge for m-page MLC LSB Read

Sub-step 1: WLs, SSL and GSL voltages trapped under HXD = 0 V and latched under VR1 Read
Sub-step 2: Keep HXD = 0 V, but a new set of voltages is applied to 64 XTs, 1 SSLp and 1 GSLp
Sub-step 3: Voltages set under HXD ≧ Vread + Vt to connect 64 WLs, 1 SSL and 1 GSL to 64 XTs, 1 SSLp and 1 GSLp
Sub-step 4: 64 WLs, 1 SSL and 1 GSL voltages latched under HXD = 0 V

Sub-step 1                Sub-step 2                Sub-step 3                Sub-step 4
WL(sel) = VR1             XT(sel) = VR2             WL(sel) = VR2             WL(sel) = VR2
63 WLs(un-sel) = Vread    63 XTs(un-sel) = Vread    63 WLs(un-sel) = Vread    63 WLs(un-sel) = Vread
SSL = Vdd                 SSLp = Vdd                SSL = Vdd                 SSL = Vdd
GSL = Vread               GSLp = Vread              GSL = Vread               GSL = Vread
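The four sub-steps of Table 7 can be written down as a small data sequence to make the hand-over from VR1 to VR2 on the selected WL explicit. This is only an illustrative sketch: the function name is hypothetical, `None` marks bus values that Table 7 does not specify, and the assumption that the block lines keep their VR1-read levels during sub-step 2 (because HXD = 0 V isolates them) is an inference from the table headings.

```python
def vr2_precharge_substeps():
    """Return the four Table 7 sub-steps as (HXD level, bus voltages, block-line voltages)."""
    new_set = {"sel": "VR2", "unsel": "Vread", "SSL": "Vdd", "GSL": "Vread"}
    trapped = dict(new_set, sel="VR1")     # levels left on the block lines by the VR1 Read
    return [
        ("0V", None, trapped),             # 1: block lines isolated, holding VR1-read levels
        ("0V", new_set, trapped),          # 2: XT/SSLp/GSLp bus driven to the new set (assumed: block unchanged)
        (">=Vread+Vt", new_set, new_set),  # 3: pass gates on, block lines follow the bus
        ("0V", None, new_set),             # 4: pass gates off, new levels latched on block lines
    ]

for n, (hxd, bus, block) in enumerate(vr2_precharge_substeps(), start=1):
    print(n, "HXD", hxd, "WL(sel) =", block["sel"])
```

Only the selected WL changes level across the sequence; the unselected WLs, SSL and GSL hold the same values throughout, which is why the VR2 precharge is cheaper than a full precharge.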

The above voltage latching is done on the instruction of a self-timed Vread Detector.

Step 828: Similarly, this decision step checks whether the XT bus lines are free for the remaining unfinished MLC MSB Read Segments. If these Segments are occupied by other concurrent operations, this MSB Read idles and waits. If not, the pages of the unfinished Segments are selected for the next WL precharge step.

Step 829: Note that although two pseudo CACHEs, CACHEcel and CACHEint, are selected for precharging and latching, only the one set of 64WLs+1SSL+1GSL lines in CACHEcel is selected for concurrent precharging and latching with VR2 (1 selected WL), Vread (63 unselected WLs), Vdd (1 SSL) and Vss (1 GSL). The 64WLs+1SSL+1GSL lines in CACHEint are disabled by coupling them to Vss.

Step 830: This is the second self-timed C_(LBL) Vinh discharging and retaining operation, performed under the VR2 setting of the MSB Read in accordance with one of the m selected 8 KB MLC MSB bit patterns within the CACHEcel register. It follows the same flow of the m-page MLC MSB Read as explained in previous Step 826, but under different bias conditions, to ensure that the previous 8 KB VR1 MSB page data stored in the 8 KB CACHEint are not destroyed by this second VR2 Read. This is done by applying the new set of bias conditions shown below:

-   -   a) TIE=0V.
        -   This condition disconnects CACHEcel from CACHEint. As a result, the last 8 KB VR1 bit values remain stored in the 8 KB CACHEint while the new 8 KB VR2 bit values are stored separately in CACHEcel.
    -   b) DIVen=CSL=SEGe=SEGo=0V to prevent leakage to the common GBLs.
    -   c) PREe=PREo=0V and LBLps=0V.
        -   Since this is a discharging step, the precharge transistors and the Vinh power line of MLBLe and MLBLo of both CACHEcel and CACHEint have to be shut off.

Again, VR2 Read Step 830 takes one-half of the time that the VR1 Read in Step 826 takes to discharge m 8 KB C_(LBL) capacitors from Vinh to Vss, because only the one C_(LBL) capacitance in CACHEcel has to be discharged. In other words, the VR2 discharge is 2-fold faster than the VR1 discharge. Similarly, a VLBL Detector on the LBLps line can be used to automatically and accurately control the VR2 discharge time, as explained for the VR1 Read. The detailed VR2 time control is therefore omitted here for brevity.

In conclusion, after this step, the following voltages are latched on the instruction of a self-timed VLBL Detector:

-   -   a) CACHEcel=MSB data under VR2 in accordance with E/A-cell=0V, but B/C-cell=Vinh,
    -   b) CACHEint=MSB data under VR1 in accordance with E-cell=0V, but A/B/C-cell=Vinh,
    -   c) All 64 WLs=SSL=GSL=0V.
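The latched patterns listed above follow a simple rule: a cell whose Vt lies below the selected-WL read voltage conducts and discharges its LBL to Vss, while a cell above it leaves the LBL at Vinh. The sketch below models that rule for the four MLC states. The names `STATES`, `READ_LEVEL` and `lbl_after_read` are hypothetical, and the VR3 entry anticipates the third read that the specification describes for the Flag=0 flow.

```python
STATES = ("E", "A", "B", "C")                  # ordered by increasing Vt
READ_LEVEL = {"VR1": 1, "VR2": 2, "VR3": 3}    # number of states whose Vt lies below each read voltage

def lbl_after_read(state, read):
    """Return 'Vss' if the cell conducts and discharges its LBL, else 'Vinh'."""
    conducts = STATES.index(state) < READ_LEVEL[read]
    return "Vss" if conducts else "Vinh"

for read in READ_LEVEL:
    print(read, [lbl_after_read(s, read) for s in STATES])
```

Under this model the VR1 pattern is the one held in CACHEint (only the E-cell discharged) and the VR2 pattern is the one held in CACHEcel (E- and A-cells discharged).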

Step 831: This step discharges the voltages of the selected WLs, unselected WLs, SSLs and GSLs of the m selected blocks after completion of the VR2 concurrent MSB read operation, for a self-timed duration, by setting the following bias conditions.

-   -   a) ENB=1=Vdd,    -   b) CLA=CLR=ENS=0V,    -   c) XT1 to XT64=SSLp=GSLp=0V

Step 832: This step uses the 4 KB Multiplier to perform a first analog amplification of the 4 KB Even LBL sensed voltages read from the corresponding 4 KB Even CACHEcel, and uses the 4 KB corresponding SAs to perform a second, analog-to-digital amplification. The resultant 4 KB amplified data are stored in the 4 KB SAs in accordance with the data sensed via the GBL/LBL charge-sharing operation. For example, the amplified bit data is Qi=0/1 when the corresponding sensed voltage is 0V/Vinh before the GBL/LBL charge-sharing dilution.

Step 833: This decision step checks the status of the Flag cell. If the Flag cell data is “0”, not “1”, the MLC-WL stores both LSB and MSB page data, and the flow moves to Step 850. If the Flag cell data is “1”, the MLC-WL stores only one MSB page of data, which is held in the 4 KB CACHEint from the previous concurrent VR1 read, and the flow moves to Step 834.

Step 834: This step is like Step 832, but reads the SLC data from the 4 KB CACHEint, which stores the VR1 read analog data in the voltage form of Vss/Vinh. Similarly, it uses the 4 KB Multiplier to perform the first analog amplification of the 4 KB Even LBL sensed voltages read from the corresponding 4 KB Even CACHEint, and uses the 4 KB corresponding SAs to perform the second, analog-to-digital amplification. The resultant 4 KB amplified data are stored in the 4 KB SAs in accordance with the data sensed via the GBL/LBL charge-sharing operation. For example, the amplified bit data is Qi=0/1 when the corresponding sensed voltage is 0V/Vinh before the GBL/LBL charge-sharing dilution.

Step 835: This decision step checks whether the 4 KB CACHE register is available to receive new data without data contention. If “No” is detected, the flow waits. If “Yes” is detected, the flow moves to Step 836 to set RDY=0V, informing the off-chip Flash controller or Host that the NAND is entering a busy state, which means no new data can be loaded into the NAND's real CACHE register for the time being.

Step 837: The DIOi and DIOiB of each bit of the CACHE register are first pre-equalized to ½ VDD before receiving the bit data from each corresponding SA bit, by setting the following conditions:

-   -   a) RD=LD=Vss, but EQ=1=Vdd. As a result, DIOi=DIOiB=1/2Vdd for        safer data transferring between each P/RB and CACHE bit.    -   b) All Y-pass gates=Vss to isolate all 4 KB CACHE from I/Os.

Step 838: The 4 KB SLC data in the 4 KB SAs are then transferred to the 4 KB corresponding CACHE simultaneously in 1 cycle by setting the following conditions, which connect the 4 KB SAs to the 4 KB CACHEs via the 4 KB NMOS devices of transistor 21.

-   -   a) LAT=1,
    -   b) EQ=RD=PGM=Vss,
    -   c) but WRT2=LAT=LD=H1,
    -   d) All Y-pass gates=Vss to isolate all 4 KB CACHE from I/Os.
        For example, the CACHE's DIOi=0/1 when the corresponding SA's QiB=1/0.

Step 839: This step shuts off the connections between the 4 KB SAs and the 4 KB CACHE and latches the data in the 4 KB CACHE by setting the following conditions:

-   -   a) EQ=RD=PGM=Vss,    -   b) WRT2=LAT=LD=Switching to Vss,    -   c) All Y-pass gates=Vss to isolate all 4 KB CACHE from I/Os.

Step 840: Before moving to Step 841 to start outputting the 4 KB Even MSB page data to the I/Os, this step resets RDY=1 to inform the off-chip Flash Controller.

Step 841: This step starts to output the 4 KB Even MSB page data to the byte-wide I/Os sequentially, one byte at a time.
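The ordering of Steps 835 to 841 can be summarized as a short sequence. The following is a behavioral sketch only, with a hypothetical function name and log format; it does not model the analog equalization or the transistor-level connections, and it treats the CACHE contents as a copy of the SA data (DIOi follows Qi, consistent with DIOi=0/1 when QiB=1/0).

```python
def read_out_page(sa_bytes, cache_free=True):
    """Walk Steps 835-841 and return (log of (step, action), bytes clocked out)."""
    log = []
    if not cache_free:
        log.append((835, "real CACHE busy: wait"))
        return log, []
    log.append((835, "real CACHE available"))
    log.append((836, "RDY=0 (busy)"))
    log.append((837, "equalize DIOi=DIOiB=1/2 Vdd"))
    cache = bytes(sa_bytes)                      # Step 838: all bits move in one cycle
    log.append((838, "transfer SA -> CACHE"))
    log.append((839, "latch CACHE, isolate from SA"))
    log.append((840, "RDY=1 (ready)"))
    out = list(cache)                            # Step 841: one byte per I/O cycle
    log.append((841, f"clock out {len(out)} bytes"))
    return log, out

log, out = read_out_page(b"\x0a\x0b")
for step, action in log:
    print(step, action)
```

The point of the sequence is that RDY is low only while the SA-to-CACHE transfer is in progress; the byte-serial output in Step 841 happens after RDY has returned to 1.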

Step 842: While the 4 KB Even MSB data are being output at Step 841, the 4 KB Odd MSB data are read out from the 4 KB Odd CACHEint to the 4 KB CACHE, again via the identical Steps 834 to 840.

Step 843: This step starts to output the 4 KB Odd MSB page data to the byte-wide I/Os sequentially, one byte at a time, and then moves to Step 883 to check for the next MSB page.

In another embodiment, the m-page MLC Read operation includes an alternative flow as shown below. This flow is based on the Flag=0 status, which indicates that the addressed MLC-WL cells do store 4 Vts with both MSB and LSB bits. Therefore, the FIG. 8E flow continues with a third, VR3 Read, with the Even and Odd MSB data from the VR2 read stored in the 8 KB CACHEmsb pseudo registers.

To read a 4-Vt MLC cell from the MLC-WLs, a total of three reads, VR1, VR2, and VR3, is required to obtain both the MSB bit and the LSB bit whenever Flag=0 is detected. Once these three sets of 8 KB MLC data are obtained, both MSB and LSB bit-flipping operations can be done accordingly. After that, the 4 KB Even and 4 KB Odd MSB and LSB data are sequentially clocked out to the off-chip Flash microcontroller via HiNAND2's on-chip real 4 KB CACHE. Again, 4K cycles are needed to clock out the respective Odd and Even LSB bits.

FIG. 8F is a flow chart showing a method for performing m-page MLC Read operation according to an alternative embodiment of the present invention. The steps of this flow mainly demonstrate how to read, sense and amplify the analog signals of m pages of Even and Odd 4-Vt MLC cells that store both MSB and LSB bits when a Flag cell=0 is detected and confirmed. Both the MSB and LSB bit data of all m 8 KB pages are finally transferred from the m selected 8 KB physical WLs in the m 8 KB pseudo CACHEcel registers via the 4 KB metal2 GBLs, then one 4 KB Multiplier, then one 4 KB SA, and lastly the 4 KB real CACHE registers, where they wait for I/O availability in the next flow (FIG. 8G).

As shown in FIG. 8F, the method starts with Step 850, which continues from Step 833 after Flag cell=0 is found, confirming that each physical MLC cell read in the MLC-WL indeed stores 4 Vts for both the MSB bit and the LSB bit. Thus, the concurrent MLC read has to continue, and the 4 KB Even MSB data is still needed for the subsequent logic operation of the LSB concurrent read to obtain the correct data of the B and C states before being sequentially clocked out to the I/Os. As a result, the 4 KB Even MSB data stored in the 4 KB SAs is loaded and latched back to the designated CACHEmsb at this step. This is similar to a DRAM write-back operation. One difference is that before the 4 KB Even MSB write-back, the 4 KB Even CACHEmsb has to be precharged with Vinh so that, in the next iterative MLC read, the signal is strong enough to withstand the CS effect.

The writeback bias conditions are summarized below:

-   -   a) SEGe (msb)=1=Vdd but SEGo (msb)=0V for CACHEmsb Segment only,    -   b) TIE=SEGo=0V,    -   c) DIVen=BIAS=WRT2=H1 one shot pulse.        -   BIAS=H1 to turn on MN61,        -   WRT2=H1 to turn on MN22,        -   DIVen[1] to DIVen[J-1]=H1 to turn on all broken 4 KB metal2            J-1 MGBL transistors along the path of selected one of m 8            KB CACHEmsb to DB when m pages are distributed in J Groups.

These H1 conditions connect each SA's QiB digital Vdd/Vss data directly to each corresponding Even CACHEmsb C_(LBL) capacitor via each GBL, with reversed polarity and without a voltage drop, in accordance with the SA circuit shown in FIG. 3. The bit data is reversed because the MSB bit data read out from the MSB cells has a polarity opposite to that of the MSB page data loaded from the 8 I/Os.

Step 851: This decision step checks whether the on-chip 4 KB real CACHE is free, that is, not occupied by any other ongoing concurrent operation. Note that this step can be performed simultaneously with Step 850 because no bus contention exists between the SA-to-CACHEmsb path and the real CACHE. If Step 851 yields No, Step 851 is looped to wait for the 4 KB real CACHE to become available. If Step 851 yields Yes, the flow moves to Step 852, and one of the 4 KB Even MSB pages is ready to be sequentially clocked out to the I/Os from the 4 KB P/RB via the 4 KB real CACHE.

Step 852: HiNAND2 will issue a busy notice by pulling the RDY pin down to Vss, to allow the off-chip Host or Flash Controller to sequentially clock out each 4 KB Even MSB page of data stored in the 4 KB P/RB in the previous steps.

Step 853: Before the 4 KB Even MSB data is taken from the 4 KB P/RB, the m 4 KB real CACHE bit data are equalized first so that DIOi=DIOiB=½ Vdd, in accordance with the following bias conditions:

-   -   a) RD=LD=0V and All Ypass=0V.        -   This condition ensures the equalization of each pair of DIOi            and DIOiB would not be affected by outputs of each P/RB and            I/Os.    -   b) LAT=EQ=1=Vdd.        -   To start performing DIOi and DIOiB equalization.

Step 854: This step transfers the 4 KB Even MSB page digital data to the 4 KB corresponding real CACHE registers while keeping the same polarity, such that each DIOi=0/1 of each real CACHE if Di=1/0 in each corresponding P/RB bit, with the following conditions:

-   -   a) WBK=IDAB=IDB=IDC=ENSB1=ENSB2=INV=RD=0V,        -   To prevent each P/RB bit being wrongly set by four pull-down            current paths below (referring to FIG. 3):        -   Path1 at Di node: Transistor 16 (gate=WBK=0V) and Transistor            17,        -   Path2 at Di node: Transistor 8 (gate=IDB=0V) and Transistor            9,        -   Path3 at Di node: Transistor 26 (gate=INV=0V) and Transistor            27,        -   Path4 at DiB node: Transistor 19 (gate=IDC=0V) and            Transistor 18,    -   b) PGM=0V,        -   To prevent each P/RB output node from each corresponding GBL            because this is not a program step.    -   c) RDRD=LD=0V and All Ypass=0V        -   Ypass=0V is to ensure the data transferring between each            P/RB is not affected by 8 I/Os.

Step 855: This step latches and isolates the 4 KB Even MSB page digital data in the 4 KB real CACHE from the 4 KB P/RB in accordance with the following added conditions: LAT=RD=Falling edge pulse.

Step 856: Once the 4 KB Even MSB data is ready at the 4 KB real CACHE, RDY=1 is set to indicate that the 4 KB Even MSB page data is ready to be clocked out.

Step 857: In this step, the 4 KB Even MSB data in the 4 KB real CACHE is sequentially clocked out via the 8 I/Os by the Flash Controller or Host. This takes 4K cycles, a lengthy time of around 40 μs if a one-edge clock scheme is used with a 10 ns clock period. During this step, the clocking out of the 4 KB Even MSB data involves only the 4 KB real CACHE and the I/Os. All other internal buses, such as the GBL bus, XT bus, and DB, are not affected, so they can be used for other concurrent operations.
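The roughly 40 μs figure follows directly from the page size, the I/O width and the clock period. The sketch below reproduces that arithmetic; it assumes 4 KB means 4096 bytes, and the `edges_per_cycle` parameter is a hypothetical knob for comparing the one-edge scheme mentioned in the text with a two-edge scheme that the text does not describe.

```python
def clock_out_time_us(page_bytes=4096, io_width_bits=8, clock_period_ns=10, edges_per_cycle=1):
    """Time to clock one page out through the I/Os, in microseconds."""
    transfers = page_bytes * 8 // io_width_bits   # one transfer moves io_width_bits bits
    cycles = transfers / edges_per_cycle          # one-edge scheme: one transfer per cycle
    return cycles * clock_period_ns / 1000

one_edge = clock_out_time_us()
print(one_edge)  # about 40 microseconds for a 4 KB page on 8 I/Os at 10 ns
```

Because the Even and Odd halves are clocked out one after the other, the full 8 KB page costs two such periods, which is why the flow overlaps the Odd-page sensing with the Even-page output.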

Step 858: While the 4 KB Even MSB page data is being sequentially clocked out at the 8 I/Os, this step concurrently conducts the 4 KB Odd MSB sensing, flipping and transferring to the 4 KB P/RB via the 4 KB GBL lines, because there is no bus contention. This can be done by repeating Steps 832 and 850-856.

Step 859: This step sequentially clocks out the remaining 4 KB Odd MSB data in the 4 KB real CACHE to the Flash Controller or Host via the 8 I/Os. It takes the same 4K cycles, which is another lengthy period of around 40 μs.

Now, after completion of the whole 8 KB MSB page read and output, the flow moves to Step 860 to repeat the same process for reading the m MLC LSB pages.

Step 860: After the 4 KB Even MSB and 4 KB Odd MSB page data have been clocked out to the I/Os and the 4 KB Even and 4 KB Odd MSB pages have been loaded and latched back to the 8 KB CACHEmsb, this step continues with a preferred self-timed Vinh precharging of only the m 8 KB CACHEcel C_(LBL) capacitors, in preparation for the next 8 KB LSB B-state read and logic operations of the 8 KB MLC cells. The preferred bias conditions are summarized below:

-   -   a) TIE1˜TIEL/2=0V (L=4),
    -   b) LBLps1=Vinh and PREe1=PREo1=H1 (for CACHEcel),
    -   c) DIVen=CSL=0V, SEGe1=SEGo1=0V,
    -   d) Other LBLps=PREe=PREo=0V (for CACHEint, CACHElsb, CACHEmsb).

This step is like the previous precharging steps, so the details are omitted here for simplicity.

Step 861: Similarly, this decision step checks whether the common XT bus lines of 64WLs+1SSLp+1GSLp are occupied by any concurrent m-page operation. If the XT bus lines are not free, this check step is looped to wait for the occupying operation to complete. If the XT bus lines are found free, the next new 4 KB Even B-state LSB page Read can be started by moving to Step 862.

Step 862: One set of 64WLs+1SSL+1GSL lines in CACHEcel is selected for similar concurrent precharging and latching with VR3 (1 selected WL), Vread (63 unselected WLs), Vdd (1 SSL) and Vss (1 GSL). A self-timed Vread Detector, as explained for the SLC Read, is also used to automatically and accurately control each WL precharging and latching period, for the same reason explained beforehand. Similarly, the dummy-WL Vread Detector generates a signal to initiate a 2-step WL, SSL, and GSL voltage latching process once Vread is detected. The desired WL-related bias voltages are identical to those of the SLC Read and are therefore omitted here for simplicity.

After the corresponding VR3 and Vread are latched, the common bus lines of 64XTs+1SSLp+1GSLp are released for other possible concurrent m-page operations, which also eliminates WL Vread stress and so benefits the longevity of the NAND cells.

Step 863: This novel step is again a self-timed 8 KB C_(LBL) Vinh discharging and retaining operation in accordance with the stored states of one of the m selected 8 KB MLC cells. With the selected WL=VR3, CACHEcel stays at Vinh for a C-cell, because Vtc>VR3, and goes to Vss for an E-cell, A-cell, or B-cell, because Vtemax<Vtamax<Vtbmax<VR3. In this step, an LBLps Differential Amplifier is used and is connected to LBLps1 with Vref=1.0V, one per Segment. The preferred bias conditions are listed below:

-   -   a) CSL=DIVen=TIE1=TIE2=0V,    -   b) SEGo1=SEGe1=0V,    -   c) PREe1=PREo1=0V,    -   d) LBLps1=0V,    -   e) Others=0V.

Step 864: The self-timed concurrent discharge of the m sets of 64WLs+1SSL+1GSL lines is initiated once one or more CACHEcel C_(LBL) capacitors are fully discharged to 1.0V or retain Vinh in accordance with the states of the 8 KB MLC cells. As explained before, the discharge detection voltage is set to 1.0V by one LBLps DA per Segment. The purpose of this step is to reduce the Vread gate disturbance of the m 64-WL sets on the m 8 KB MLC cells. The preferred sets of bias conditions are listed below:

-   -   a) ENB=1, CLR=ENS=CLA=0,    -   b) XT1˜XT64=SSLp=GSLp=0V,    -   c) CLWL=one-shot pulse of Vdd.

Condition b) above is set for the XT bus lines because the 64WLs+1SSL+1GSL lines have to discharge through the 64XTs+1SSLp+1GSLp common vertical bus lines, in accordance with the Latch Flag status of the Block-decoder circuit shown in FIG. 2A, which sets the HXD node to Vdd. Each Flag latch includes two inverters, INV3 and INV4, that work with the selected address stored in the Address Buffer and a self-timed control circuit (not shown).

Step 865: This step senses and amplifies the 4 KB Even LSB A-state page data from the 4 KB Even CACHEint to the 4 KB SAs via the 4 KB Multipliers, as read under Selected-WL=Vtamin, with the same polarity. In other words, each SA's digital Qi=0/1 when the corresponding analog C_(LBL)=Vss/Vinh in CACHEint. As a result, Qi=0,1,1,1 for the corresponding states of E,A,B,C per each SA. The preferred bias conditions are listed below:

-   -   a) CSL=0V,    -   b) DIVen=SEGe2=H1 (CACHEint), SEGo2=0V,    -   c) SEGe1=SEGe4=0V (CACHEcel and CACHEmsb),    -   d) TIE1˜TIEL/2=0V (L=4), T5=one shot Vdd,    -   e) Voutp=Vref+/−AV.

Step 866: This step transfers the 4 KB Even LSB A-state page data from the 4 KB SAs to the 4 KB P/RB with the same polarity. In other words, each P/RB's digital Di=0/1 when each SA's digital Qi=0/1. As a result, each P/RB's Di=0,1,1,1 for the corresponding states of E,A,B,C, the same as each SA. The preferred bias conditions are listed below:

-   -   a) INV=IDB=IDAB=0V,        -   INV=0V to block Di path to Vss via Transistor 26 (referring            to FIG. 3),        -   IDB=0V to block Di path to Vss via Transistor 8,        -   IDAB=0V to block Di path to Vss via Transistor 6.    -   b) IDC=WBK=One shot Vdd        -   This is to enable one paired paths to Vss from Di and DiB            when Qi=1.    -   c) ENSB1=ENSB2=0V,        -   Due to MLSB=MLSBB=X.    -   d) PGM=0V,        -   To ensure the output or P/RB at PBLP would not affect GBL.    -   e) T5=1 to enable SA.

Step 867: This step senses and amplifies the 4 KB Even MSB page data from the 4 KB Even CACHEmsb to the 4 KB SAs via the 4 KB Multipliers, as read under Selected-WL=Vtbmin, with the same polarity. In other words, each SA's digital Qi=0/1 when the corresponding analog C_(LBL)=Vss/Vinh in CACHEmsb. As a result, Qi=0,0,1,1 for the corresponding states of E,A,B,C per each SA. The preferred bias conditions are listed below:

-   -   a) CSL=0V,    -   b) DIVen=SEGe4=H1 (CACHEmsb), SEGo4=0V,    -   c) SEGe1=SEGe2=0V (CACHEcel and CACHEint),    -   d) TIE1˜TIEL/2=0V (L=4), T5=one shot Vss,    -   e) Voutp=Vref+/−AV.

Step 868: This step transfers the 4 KB Even MSB page data from the 4 KB SAs to the 4 KB CAP1 with the same polarity. In other words, each P/RB's digital Di=0/1 when each SA's digital Qi=0/1. As a result, each P/RB's Di=0,1,1,1 for the corresponding states of E,A,B,C, as in each SA. The preferred bias conditions are listed below:

-   -   a) INV=IDB=IDAB=0V,
        -   INV=0V to block Di path to Vss via Transistor 26,
        -   IDB=0V to block Di path to Vss via Transistor 8,
        -   IDAB=0V to block Di path to Vss via Transistor 6.
    -   b) IDC=WBK=One shot Vdd
        -   This is to enable one paired paths to Vss from Di and DiB when Qi=1.
    -   c) ENSB1=ENSB2=0V,
        -   Due to MLSB=MLSBB=X.
    -   d) PGM=0V,
        -   To ensure the output or P/RB at PBLP would not affect GBL.
    -   e) T5=1 to enable SA.

Step 869: This step senses and amplifies the 4 KB Even C-state page data from the 4 KB Even CACHEcel to the 4 KB SAs via the 4 KB Multipliers, as read under Selected-WL=Vtcmin, with the same polarity. In other words, each SA's digital Qi=0/1 when the corresponding analog C_(LBL)=Vss/Vinh in CACHEcel. As a result, Qi=0,0,0,1 for the corresponding states of E,A,B,C per each SA. The preferred bias conditions are listed below:

-   -   a) CSL=0V,    -   b) DIVen=SEGe1=H1 (CACHEcel), SEGo1=0V,    -   c) SEGe2=SEGe4=0V (CACHEint and CACHEmsb),    -   d) TIE1˜TIEL/2=0V (L=4), T5=one shot Vss,    -   e) Voutp=Vref+/−AV.

Step 870: This step performs the bit flipping in accordance with the table shown in FIG. 8H, the DB circuit shown in FIG. 3, and the following preferred bias conditions:

-   -   a) INV=IDAB=0V,        -   INV=0V to block Di path to Vss via Transistor 26,        -   IDB=One shot to enable Di path to Vss via Transistor 8,        -   IDAB=0V to block Di path to Vss via Transistor 6.    -   b) IDC=WBK=One shot Vdd,        -   This is to enable one paired paths to Vss from Di and DiB            when Qi=1.    -   c) ENSB1=ENSB2=0V, Due to MLSB=MLSBB=X.    -   d) PGM=0V,        -   To ensure the output or P/RB at PBLP would not affect GBL.    -   e) T5=1 to enable SA.

More details of the bit flipping are as follows. SA=C-state read, thus Qi=0,0,0,1 and QiB=1,1,1,0 respectively for the E,A,B,C states; CAP1=B-state read, thus CAP1=0,0,1,1 for the E,A,B,C states; P/RB=A-state read, thus Di=0,1,1,1 for the E,A,B,C states. QiB AND CAP1=0,0,1,0 is used to flip P/RB, so only the third bit from the left, which is 1, flips P/RB. As a result, P/RB=0,1,1,1 is converted to 0,1,0,1, as seen in FIG. 8H. In conclusion, after this step, all three A-, B-, and C-state Block Reads of the 4 KB MLC LSB page data are done, and the data is ready to be transferred to the 4 KB real CACHE and then to the 8 I/Os. The Read operation flow moves to Step 871 in FIG. 8G below.
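The bit-flipping rule can be verified with a few lines of logic on the four-state patterns given above. This is a minimal sketch with a hypothetical function name; it models the flip as a pull-down of Di to 0 wherever QiB AND CAP1 is 1, which is an interpretation of the one-shot IDB path to Vss and gives the same result as a toggle for these patterns.

```python
def flip_lsb(q, cap1, prb):
    """Apply the LSB bit flip: clear P/RB bits where (NOT Qi) AND CAP1 is 1.

    q    -- C-state read held in the SA, per state E, A, B, C
    cap1 -- B-state (MSB) read held in CAP1
    prb  -- A-state read held in the P/RB
    """
    qb = [1 - x for x in q]                       # QiB is the complement of Qi
    flip = [a & b for a, b in zip(qb, cap1)]      # 1 only for the B state
    return [d & (1 - f) for d, f in zip(prb, flip)]

lsb = flip_lsb(q=[0, 0, 0, 1], cap1=[0, 0, 1, 1], prb=[0, 1, 1, 1])
print(lsb)  # LSB pattern for states E, A, B, C
```

Only the B state satisfies both conditions (above VR2 but below VR3), so it is the only position whose A-state read value is cleared, which yields the 0,1,0,1 pattern quoted from FIG. 8H.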

FIG. 8G is a flow chart showing a method for performing m-page MLC Read operation according to an alternative embodiment of the present invention. As shown, this flow continues from the previous m-page 4-Vt MLC Read operation flow at Step 870. The top part shows the steps of synchronously outputting both the 4 KB Even 4-Vt MLC LSB page data in 4K clock cycles and the 4 KB Odd MLC LSB page data in another 4K clock cycles from the 4 KB shared on-chip real CACHE to the off-chip Flash controller via the 8 I/Os under the condition of Flag cell=0. The bottom flow shows that only 1 byte of 8 LSB bits of the 1-byte 2-Vt MLC cells is set to 1, e.g., LSB=1 representing all 4 KB LSB bits=1, because no LSB bit data has been programmed into each MLC Even physical cell. In other words, each MLC cell has 2 Vts of the MSB logic bit only, rather than 4 Vts containing both MSB and LSB logic bits.

Step 871: From previous Step 870, the desired 4 KB MLC LSB Even page data, after the A-, B-, and C-state Reads and bit flipping, are stored in the 4 KB P/RB and are ready to be shifted to the on-chip 4 KB real CACHE. As explained before, Step 871 checks whether the 4 KB real CACHE is occupied and not yet freed by other existing operations besides this m-page MLC Read. If the check shows “Not free”, this step is looped to wait for the 4 KB real CACHE to be cleared. If the check shows “Yes, it is free”, the flow moves to Step 872.

Step 872: As explained before, HiNAND2 sets RDY=0 to indicate that it has become BUSY performing the sequential transfer of the 4 KB Even LSB page data from the 4 KB P/RB to the 4 KB real CACHE. The 4 KB real CACHE is now temporarily occupied.

Steps 873, 874, 875, 876, and 880: These 5 steps are like previous Steps 837, 838, 839, 840, and 841. The data transfer starts by equalizing each paired DIOi and DIOiB of each real CACHE bit using LAT=EQ=1=Vdd; the transfer is then done by setting LAT=1=Vdd and RD=H1; and the latching in the CACHE is done on the falling edge of the one-shot signals of both LAT and RD. Lastly, at Step 880, the 4 KB Even MLC LSB page data are clocked out sequentially in 4K cycles.

Next, Steps 881 and 882 are like Steps 842 and 843 and repeat the transferring and clocking out for the remaining 4 KB Odd MLC LSB pages. The details of these steps are omitted here for brevity, without limiting the scope of the claims.

Step 883: There are two inputs into Step 883. The first input is from Step 882. In this case, decision Step 883 checks whether the last page of the 8 KB MLC LSB data has been clocked out. If “Yes”, this preferred m-page 8 KB MLC LSB Read flow ends at Step 885. Otherwise, the flow continues to repeat the Read for the remaining MLC LSB pages, as indicated by Step 884. The second input is from Step 843. In this case, decision Step 883 checks whether the last page of the 8 KB MLC MSB Read has been clocked out. If “Yes”, this preferred m-page 8 KB MLC MSB Read flow ends at Step 885. Otherwise, the flow continues to repeat the Read for the remaining MLC MSB pages.

Step 884: This step provides that as long as at least one page of MLC MSB or MLC LSB page data has not yet been clocked out to the I/Os, this m-page MLC Read operation continues. The flow then moves to Step 823, which performs the LBL precharge, to repeat the whole Even and Odd page LSB Read operations for the remaining pages of this preferred m-page MLC Read. Later, the flow comes to Step 883 again, until no more pages are left for this preferred m-page MLC Read. Eventually, the flow ends at Step 885.

FIG. 8H is a table of MLC cell logic states for MLC Read operation of Even Page with Flag cell=0 according to an embodiment of the present invention. This table shows the detailed steps of a B′-state bit data adjustment. Each B′-cell adjustment is done in the corresponding P/RB according to the MSB and LSB logic values stored in the corresponding SA. In total, 10 sub-steps of the logic values “1” and “0” at the SA and P/RB assigned for the four interim MLC Vt states of E, A, B and C cells are summarized for a 4-Vt MLC program from beginning to end. Specifically, the detailed sequence of bit-flipping steps is shown for this preferred m-page MLC Read operation of the 4 KB Even Page. The same table can also be used as a guideline for the MLC Read of 4 KB LSB Odd Page data.

As seen at the top of the table, the bit flipping of the m-page MLC Read (regardless of Even or Odd LSB page) is preferably performed between the output pairs Qi and QiB of the 4 KB SAs, the temporary pseudo registers CAP1 and CAP2, and the corresponding 4 KB P/RBs, which may have the same or different sizes, in accordance with the circuit shown in FIG. 3. Throughout the specification, the same 4 KB size is used for the SAs, CAP1, CAP2, and P/RBs for illustration purposes, without limiting the scope of the claims.

All terminologies used for this table and the features of the m-page MLC Read operation are explained below:

-   -   I. @SA: At each SA.
        -   It indicates each MLC cell's interim and final Read data for the respective states of E, A, B1′, B2′ and C stored at each SA bit.
    -   II. @P/RB: At each P/RB.
        -   It indicates each MLC Read's interim and final bit data for the respective states of E, A, B1′, B2′ and C stored at each P/RB.
    -   III. “X”:
        -   It means a “Don't-care” initial state that can store either “1” or “0”.
    -   IV. The Vt distributions of the four final states of E, A, B and C or the 5 interim states of E, A, B1′, B2′ and C are defined in FIG. 4.
    -   V. Multiplier, SA and P/RB circuits are shown in FIG. 3.
    -   VI. The HiNAND2 array circuit is shown in FIG. 1A.
    -   VII. The schemes of MLC Even and Odd Block Read are the same.
    -   VIII. The bit flipping and logic operations are performed at each P/RB but controlled by each final data bit at SA and CAP1 and CAP2 (see FIG. 3).

Referring to FIG. 8H, a) Discharge E-state cell in each CACHEint (Steps 825 and 826 of FIG. 8E): This is the first step to sequentially read the MLC cell's state, starting from the lowest selected WL level of Vtamin within the 4 KB pseudo CACHEcel. Only the E-state is at Vss, while the A, B, and C states are at Vinh. This is referred to as the A-state Read. TIE1=H1 is then used to pass the discharge result from each CACHEcel to each CACHEint. After that, TIE1=Vss is set to latch the A-state data at CACHEint. At this moment, the SA and P/RB are not loaded yet, so the data stored in both SA and P/RB are in the ‘X’ state.

b) Discharge E/A-state cell in each CACHEcel (Steps 829 and 830 of FIG. 8E): This is the second step to sequentially read the MLC cell state, moving from the A-state to the B-state by raising the selected WL voltage from Vtamin to Vtbmin within the 4 KB pseudo CACHEcel only, under TIE1=0V. As a result, E=A=Vss but B=C=Vinh. This is referred to as the B-state Read. At this moment, the SA and P/RB are still not loaded, so the data stored in both SA and P/RB are in the ‘X’ state. The B-state read is the same as the MSB bit read. After this step, the 4 KB MLC page data is loaded into the 4 KB SA with the same bit polarity.

c) Read (or restore) 4 KB Even MSB data from 4 KB SAs to I/Os via 4 KB real CACHE (Step 832 of FIG. 8E and Steps 851-856 of FIG. 8F): As a result, @4 KB SA=0011=E,A,B,C because Vtbmin>Vtamax>Vtemax, but @4 KB P/RB=X because the P/RB is not yet loaded with any new MLC page data. At this step, the 4 KB Even MSB page data can be clocked out sequentially by the Flash Controller or Host.

d) Load and latch 4 KB MLC Even page from 4 KB SAs to 4 KB CACHEmsb (Step 850 of FIG. 8F): This step loads and latches the 4 KB MSB Even page data back to the 4 KB Even CACHEmsb with the same polarity, but the 4 KB P/RB are not affected yet. As a result, @SA=0011=E,A,B,C, but @4 KB P/RB=X=E,A,B,C.

e) Discharge E/A/B cell in each CACHEcel (Steps 863 and 864 of FIG. 8F): This is the third step to sequentially read the MLC cell state, moving from the B-state to the C-state by raising the selected WL voltage from Vtbmin to Vtcmin within the 4 KB pseudo CACHEcel only, under TIE1=0V. As a result, E=A=B=Vss but C=Vinh. This is referred to as the C-state Read. At this moment, the 4 KB C-state data is temporarily stored in the 4 KB Even CACHEcel. Neither SA nor P/RB is affected by the C-state read yet. As a result, the data stored in SA=B-state=0011=EABC, and P/RB is still in the ‘X’ state.

f) Read (or restore) 4 KB A-state read data from 4 KB Even CACHEint to P/RB (steps 865 and 866 of FIG. 8F): This step also transfers the 4 KB SAs to the 4 KB P/RB, i.e., the 4 KB SA data are duplicated to the 4 KB P/RB. As a result, @4 KB SA=0111=E,A,B,C and @4 KB P/RB=0111.

g) Read (or restore) 4 KB B-state read data from 4 KB Even CACHEmsb to 4 KB SAs (steps 865 and 866 of FIG. 8F): This step also transfers the 4 KB reversed Even MSB page data to the 4 KB SAs. As a result, @4 KB SA=1100=E,A,B,C while keeping @4 KB P/RB=0111.

h) Latch 4 KB Even B-state read data at 4 KB SAs to 4 KB CAP1 and CAP2 (step 868 of FIG. 8F): This step is done by setting both ENSB1=ENSB2=Vdd to connect each SA's output Qi to each P/RB DiB input node=MLSB and the other output QiB to the other P/RB Di input node=MLSBB. This step does not affect the contents of either the SA or the P/RB. As a result, @4 KB SA=1100=E,A,B,C but @4 KB P/RB=0111.

i) Flip B-state cell in P/RB (step 870 of FIG. 8F): As a result, @4 KB SA=0001=E,A,B,C but @4 KB P/RB=0101.

j) Read LSB data from P/RB to I/O via real CACHE (steps 870-876 and 880 of FIG. 8F): As a result, @4 KB SA=0001=E,A,B,C but @4 KB P/RB=0101, because transferring the 4 KB P/RB Even LSB page data to 8 I/Os does not affect the contents of the SAs or P/RBs at all. This 4 KB output clocking takes 4K lengthy cycles, after which the whole m-page MLC MSB and LSB Read is done.
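The bit patterns quoted in the steps above can be checked with a small behavioral sketch. This is only an illustration under stated assumptions: the function names are hypothetical, a '1' stands for an LBL left at Vinh and a '0' for an LBL discharged to Vss, and the XOR expression for the LSB is a logical restatement of the "flip B-state cell" step inferred from the quoted patterns, not the SA, P/RB, CAP1 and CAP2 circuit itself.

```python
# Behavioral sketch of the three-threshold MLC read (hypothetical names).
STATES = ["E", "A", "B", "C"]                 # ordered by increasing Vt
LEVEL = {s: i for i, s in enumerate(STATES)}

def threshold_read(state, boundary):
    """Return 0 if the cell conducts (LBL discharged to Vss), else 1 (LBL stays at Vinh)."""
    return 1 if LEVEL[state] >= LEVEL[boundary] else 0

def decode(state):
    """Return (msb, lsb) for one cell from the A-, B- and C-state reads."""
    a = threshold_read(state, "A")            # WL at Vtamin
    b = threshold_read(state, "B")            # WL at Vtbmin
    c = threshold_read(state, "C")            # WL at Vtcmin
    msb = b                                   # B-state read is the MSB read
    is_b_state = b & (1 - c)                  # above Vtbmin but below Vtcmin
    lsb = a ^ is_b_state                      # assumed form of the B-state bit flip
    return msb, lsb

def pattern(boundary):
    """Bit pattern over E,A,B,C for a single threshold read."""
    return "".join(str(threshold_read(s, boundary)) for s in STATES)

msb_pattern = "".join(str(decode(s)[0]) for s in STATES)
lsb_pattern = "".join(str(decode(s)[1]) for s in STATES)
print(pattern("A"), pattern("B"), pattern("C"), msb_pattern, lsb_pattern)
```

Under these assumptions the A-, B- and C-state reads give 0111, 0011 and 0001 over E,A,B,C, and the derived MSB and LSB pages are 0011 and 0101, matching the SA and P/RB contents stated in the steps.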

FIG. 9A is a differential amplifier (DA) circuit diagram for generating, detecting, and latching a Vpgm voltage by setting Vref=Vpgm in the Comparator, with a full RC-delay tracking capability for the selected WL during a self-timed concurrent/pipeline multi-page Program operation according to an embodiment of the present invention. As shown, it is a Vpgm-DA (DA1) circuit comprising four parts. The DA1 circuit has two inputs, one of which is connected to a dummy WL in the middle of a 3-WL layout. The DA1 circuit further includes a Vref Generator for generating at least two voltages, such as Vref=Vpgm for programming the WL and Vref=1.0V for discharging the WL voltages. Additionally, the DA1 circuit includes a 3-WL layout, wherein the middle WL is selected as a dummy WL for self-timed WL precharge-time control, with distributed R and C that are identical to and track those of the selected WL. In the worst case, the R of each selected WL is one poly resistor with a length of 8 KB cells, and the C is the physical capacitance of one WL overlapping the substrate together with the parasitic capacitances to the two adjacent WLs. That is why a 3-WL layout of the same length is used and the middle WL is preferably selected as the dummy WL, so that it tracks the worst-case RC delay of the selected WL in the real HiNAND2 array. Furthermore, the DA1 circuit includes a dummy-WL Vpgm generator (pump) circuit.

Referring to FIG. 9A, the DA1 circuit provides WL precharging when a Program operation starts. During an SLC Program operation, the Vpgm-DA is enabled. Other operations, such as Read and all Verify operations, require no Vpgm voltage, so the Vpgm-DA is disabled.

Initially, the dummy WL is set to Vss. When Program is started, the Vpgm-DA is enabled and the “-” input node is set to Vpgm-ΔV, where ΔV<0.5V is a voltage margin. In some embodiments, ΔV can be 0V. Once Program starts, the dummy WL voltage is raised by the Vpgm Pump circuit. When the dummy WL reaches this detection level, the Vpgm-DA output rises from Vss to Vdd to indicate that Vpgm on the selected WLs has been detected and established. The enabling EN signal is used to latch the voltages of the 64WLs+1SSL+1GSL lines on the parasitic capacitors of the corresponding poly lines.

Further, the DA1 circuit provides WL discharging control when Program is finished. The same Vpgm-DA can be used to perform automatic self-timed WL discharge control by setting Vref=1.0V. Typically, Vpgm is about 15V to 25V, which takes a long time, up to a few μs, to charge up. For WL discharging, tracking the discharge time by detecting 1.0V on the selected WL is safe because 1.0V is a low voltage (LV) and the time to reach it is approximately equal to the whole discharge time from 15V-25V to Vss.
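As a rough illustration of the self-timed detection described above, the following sketch estimates when a comparator would trip on a dummy WL modeled as a single-pole RC node. The first-order model, the RC value, the pump target of 22V and the function names are all assumptions made for illustration; the specification describes a distributed RC line tracked by the dummy WL, not this lumped approximation.

```python
import math

def charge_detect_time(v_pump, v_detect, rc):
    """Time for an RC node charging from 0V toward v_pump to cross v_detect."""
    if not 0.0 < v_detect < v_pump:
        raise ValueError("v_detect must lie strictly between 0V and the pump target")
    return rc * math.log(v_pump / (v_pump - v_detect))

def discharge_detect_time(v_start, v_ref, rc):
    """Time for an RC node discharging from v_start toward 0V to fall to v_ref."""
    if not 0.0 < v_ref < v_start:
        raise ValueError("v_ref must lie strictly between 0V and v_start")
    return rc * math.log(v_start / v_ref)

RC = 0.5e-6      # hypothetical lumped WL time constant, in seconds
VPGM = 20.0      # within the 15V to 25V range given in the text
V_PUMP = 22.0    # hypothetical pump target above Vpgm

t_charge = charge_detect_time(V_PUMP, VPGM, RC)      # comparator trips at Vref=Vpgm
t_discharge = discharge_detect_time(VPGM, 1.0, RC)   # comparator trips at Vref=1.0V
print(f"charge detect: {t_charge * 1e6:.2f} us, discharge detect: {t_discharge * 1e6:.2f} us")
```

With these assumed values both detection times come out on the order of a microsecond, consistent with the "few μs" scale mentioned for WL charging.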

Furthermore, the DA1 circuit initiates each iterative Program time, i.e., Tpgm, for In-System Serial Programming (ISSP). Once the Vpgm voltage is reached and detected, the formal Program starts. Usually, the whole average 1-state program time is around 250 μs. If it is split into 10 ISSP steps, each iterative Tpgm=25 μs. The Tpgm time can be generated by an on-chip RC delay circuit using a MOS transistor for R (resistance) and an HV MOS gate for C (capacitance). Tpgm=RC and is initiated when Vpgm is detected. A self-timed delay control circuit with an EN input and an ENDIS output is shown for illustration purposes only. Those skilled in the art of IC design will recognize many other ways to implement it.
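The timing arithmetic in this paragraph can be written out as a short sketch. The 250 μs total and the 10 steps come from the text; the 1 pF gate capacitance and the function names are hypothetical values chosen only to show how Tpgm=RC would fix the required resistance.

```python
def iterative_tpgm_us(total_program_time_us, num_steps):
    """Split the total 1-state program time evenly over the iterative steps."""
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    return total_program_time_us / num_steps

def resistance_for_delay(tpgm_s, capacitance_f):
    """Resistance needed so that R*C equals the desired Tpgm."""
    return tpgm_s / capacitance_f

tpgm_us = iterative_tpgm_us(250.0, 10)        # 25 us per iterative step
C_GATE = 1e-12                                # hypothetical HV MOS gate capacitance (1 pF)
r_needed = resistance_for_delay(tpgm_us * 1e-6, C_GATE)
print(f"Tpgm = {tpgm_us} us, R = {r_needed / 1e6:.1f} Mohm for C = 1 pF")
```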

FIG. 9B is a differential amplifier (DA) circuit diagram for generating, detecting, and latching a Vpass voltage by setting Vref=Vpass in the Comparator, with a full RC-delay tracking capability for the selected WL during self-timed concurrent/pipeline multi-page Read and Verify operations according to an embodiment of the present invention. As shown, it is a Vread-DA (DA2) circuit. This circuit is similar to the Vpgm-DA circuit and comprises four similar parts. The only difference between DA2 and DA1 is the precharging detection voltage. For all Verify operations, the highest and slowest-charged voltage on each set of 64WLs+1SSL+1GSL lines is Vread, regardless of whether the operation is Read, Program-Verify, or Erase-Verify. Thus the Vread-DA can be used for all non-Program operations.

The Vread voltage is generated independently of the Vpgm voltage because embodiments of the present invention allow M-page concurrent operations. This means some selected pages can be in Program mode while other selected pages are in Program-Verify mode, a scenario that occurs often in this HiNAND2 memory system with its substantially high flexibility. Thus one Vpgm-DA cannot be shared by many concurrent operations, and one independent Vpgm-DA and another independent Vread-DA are preferred, although the layouts of both circuits are identical. The details of Vread generation are similar to those of Vpgm generation.

FIG. 9C is a differential amplifier (DA) circuit diagram for generating, detecting, and latching a VLBLps voltage up to Vinh for self-timed concurrent/pipeline operations according to an embodiment of the present invention. As shown, it is a VLBL-DA or LBLps-DA (DA3) circuit. Whereas the Vpgm-DA and Vread-DA are used to detect a worst-case voltage of the selected WL during charging and discharging, this VLBL-DA circuit is used to detect the charging and discharging of the C_(LBL) voltage in either the 4 KB Odd C_(LBL) capacitors, the 4 KB Even C_(LBL) capacitors, or both, i.e., 8 KB metal C_(LBL) capacitors, used as pseudo CACHEs. Note that N=4 KB in the descriptions within the present specification. Since the C_(LBL) charging or discharging is performed through each corresponding common LBLps line, the two names VLBL-DA and LBLps-DA are used interchangeably in the specification.

In an embodiment, the VLBL-DA circuit includes a differential amplifier (DA) circuit, a Vref generator circuit, and a sensing line gate array. The DA circuit is the same as that in the Vpgm-DA and Vread-DA circuits shown in FIG. 9A and FIG. 9B. The Vref generator circuit is also the same as the corresponding part of the Vpgm-DA and Vread-DA circuits. The sensing line uses the existing LBLps line made of a metal0 layer in the HiNAND2 array. Thus there is no layout overhead like that of the 3-WL layout of the Vpgm-DA and Vread-DA above. Further, the sensing gate array also uses the existing Segment LBLps NMOS transistors, MLBLse for respectively connecting the Even LBLs and MLBLso for respectively connecting the Odd LBLs, as seen in FIG. 9C.

In a specific embodiment, the VLBL-DA provides an LBL precharging feature. Referring to FIG. 9C, during precharging of the 4 KB Odd or 4 KB Even C_(LBL) lines, such as LBLe[1] to LBLe[N] and LBLo[1] to LBLo[N], Vref is set to Vinh. In an alternative embodiment, Vref=1.0V during discharging of the 4 KB Odd or 4 KB Even C_(LBL) lines, which is reflected on the LBLps line. In a worst-case scenario, just one or a few C_(LBL) capacitors may be discharged during m-page Read, Program-Verify, and Erase-Verify operations. Therefore, a maximum allowed discharge time should be incorporated. The discharge time can be set to less than 3 μs for this shorter metal1 LBL line.

During the precharging step, the LBLps line is initially at 0V. When precharging is initiated, the voltage of each selected LBLps line rises to Vinh (≧7V) in accordance with the following bias conditions: LBLps=Vinh, with PREe≧Vinh+Vt if the 4 KB Even C_(LBL)s are selected or PREo≧Vinh+Vt if the 4 KB Odd C_(LBL)s are selected. The Vinh precharging current flows from an LBLps voltage generator to the selected 4 KB C_(LBL) capacitors through the 4 KB MLBLse or MLBLso transistors. All nodes of LBL[1] to LBL[N] (regardless of odd or even LBLs) are charged up along with the one common LBLps supply line, which serves as the sensing line. When all C_(LBL) capacitors are charged to Vinh, LBLps=Vinh as well. Thus, once the VLBL-DA detects LBLps=Vinh, all 8 KB C_(LBL)s=Vinh. This is the so-called C_(LBL) detection voltage, which is reflected on each common LBLps line.

During the discharging step, LBLps is initially at Vinh. When discharging is initiated by one of VR1 to VR3, or Vtamin, Vtbmin, or Vtcmin, or Verase=0V, the voltage drop of each selected C_(LBL) is reflected on the corresponding LBLps line. To prevent the leakage of a discharging C_(LBL) from pulling down the unselected C_(LBL)s, the VLBL-DA sensing-line detection has to be designed for high speed, so that once the LBLps line is pulled down to about 2V, the VLBL-DA can respond by shutting off conduction of all MLBLse and MLBLso transistors to prevent further leakage between any adjacent LBL lines. This can be done by setting PREe=PREo=Vss once the drop of C_(LBL) to 1.0V is detected.

In an alternative embodiment, the present invention provides a method for randomly erasing one or more pages selected from multiple Blocks of the HiNAND2 memory array. Table 8 shows a preferred set of bias voltage conditions and the selected WL number per Block for Erase. The bias voltage conditions for the selected WL, TPW, and deep N-well are kept the same as in prior-art NAND because the same cell structure and channel FN-tunneling scheme are used.

TABLE 8

Selected Block | Voltage | Number
WLs (randomly selected) | 0 V | X
WLs (randomly unselected) | Floating | Y
SSL | Floating | 1
GSL | Floating | 1
Triple P-well | 20 V | 1
Deep N-well | 20 V | 1
P-sub | 0 V | 1

where X+Y=64 and X≧1, Y≧0 for a 64-cell String. X=number of selected WLs and Y=number of non-selected WLs. For example, only 10 random WLs in the first selected Block and 8 random WLs in the second selected Block are selected for the preferred Dispersed Block Erase, as seen in Table 9. A simple description of the inventive concept is shown below.

TABLE 9 An example with 18 random pages from two random Blocks selected for Erase

Selected Block | Voltage | Number
WL1, WL5, WL7, WL8, WL9, WL15, WL20, WL31, WL40, WL60 (10 selected WLs in selected Block 1) | 0 V | X = 10
WL2-4, WL6, WL10-14, WL16-19, WL21-30, WL32-39, WL41-59, WL61-64 (54 unselected WLs in selected Block 1) | Floating | Y = 54
SSL1 and GSL1 (In selected Block 1) | Floating | 2
WL10, WL15, WL29, WL30, WL43, WL48, WL50, WL64 (8 selected WLs in selected Block 2) | 0 V | X = 8
WL1-9, WL11-14, WL16-28, WL31-42, WL44-47, WL49, WL51-63 (56 unselected WLs in selected Block 2) | Floating | Y = 56
SSL2 and GSL2 (Block 2) | Floating | 2
WL1-WL64, and SSL and GSL (In other unselected erased Blocks) | Floating | X = 0, Y = 64
Triple P-well | 20 V | 1
Deep N-well | 20 V | 1
P-sub | 0 V | 1

Where X + Y = 64 and X ≧ 1, Y ≧ 0 for a 64-cell String.
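The X and Y counts in Table 9 follow from the rule that the unselected (floating) WLs of a selected Block are simply the complement of the selected WLs within the 64-cell String. The sketch below derives them; the function names are hypothetical and the code only restates the X+Y=64 bookkeeping, not any decoder circuit.

```python
M = 64  # cells (WLs) per String, as in the 64-cell String example

# Selected WLs per selected Block, copied from Table 9.
SELECTED = {
    1: {1, 5, 7, 8, 9, 15, 20, 31, 40, 60},
    2: {10, 15, 29, 30, 43, 48, 50, 64},
}

def unselected_wls(selected, m=M):
    """WLs left floating in a selected Block: every WL not selected for erase."""
    return sorted(set(range(1, m + 1)) - set(selected))

def to_ranges(wls):
    """Compress a list of WL numbers into inclusive (first, last) ranges."""
    ranges = []
    for wl in sorted(wls):
        if ranges and wl == ranges[-1][1] + 1:
            ranges[-1][1] = wl
        else:
            ranges.append([wl, wl])
    return [tuple(r) for r in ranges]

for block, sel in SELECTED.items():
    unsel = unselected_wls(sel)
    x, y = len(sel), len(unsel)
    assert x + y == M and x >= 1
    print(f"Block {block}: X={x}, Y={y}, floating WL ranges={to_ranges(unsel)}")
```

For Block 1 this gives X=10 and Y=54, and for Block 2 it gives X=8 and Y=56, for a total of 18 selected pages.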

The above bias conditions are set and addressed in accordance with the Block-decoder circuit shown in FIG. 2A and the corresponding row addresses stored within on-chip Address Buffers, which are omitted for simplicity. Each set of Block-Erase conditions is established on a block-by-block basis. Once all Erase conditions are set and latched, Erase can be started as long as no other array operations are ongoing. Then 20V can be applied to the selected P-well and N-well of the selected NAND planes. The cells on the selected WLs=Vss are erased, whereas the unselected WLs, floating at Vdd-Vt, are coupled up to 20V+Vdd-Vt=21V. Thus, as in prior-art NAND, the voltage drop between the channel and gate of the unselected NAND cells is only 1V, so no Erase FN tunneling happens and the Vt of those unselected NAND cells does not change at all. Each iterative erase pulse is applied and then verified, with a total spec limit of 2-5 ms, under automatic control of an on-chip timer, as is well known to those skilled in the NAND art.

In summary, a random page Erase in the present invention means that one or more random pages or WLs in one or more NAND Blocks are erased simultaneously. Here, each physical Block is assumed to include 8 KB physical NAND cells in one physical WL or page.

In prior-art NAND, the minimum erase size is one Block that contains 8 KB BLs and 64 WLs, given a 64-cell NAND String. When a Block is selected for erasing, all 8 KB BLs and the SSL and GSL lines are left floating, but the selected 64 WLs are coupled to Vss and the common P-well is coupled to an HV Erase voltage of 20V. As a result, all 8 KB×64 cells in the selected Block are erased simultaneously and their Vt becomes ≦−0.5V, typically within 2-5 ms. The WLs, SSL, and GSL lines in the unselected Blocks sharing the same P-well and N-well are left floating to avoid being erased. Conversely, the Erase operation according to an embodiment of the present invention allows any arbitrary number of WLs or Pages to be selected for erasing. The selected number can be 1 to 64 per selected Block without any restriction on WL locations in each selected Block. Multiple random WLs in multiple Blocks can be selected for a concurrent Erase within the same erase time of 2-5 ms. This preferred Erase scheme is termed Dispersed Logic Block-Erase. A novel 2-step Erase scheme of Dispersed Block-Erase to lock the erased gate voltages for multiple selected and non-selected WLs is disclosed. This random Block-Erase scheme can be applied to any 2D and 3D NAND flash memory.

Lastly, the preferred Dispersed Erase size varies from the smallest size of one single physical WL to the biggest size of several Blocks, depending on the capability of the on-chip Erase pump and the erase-time spec. Even some random pages mixed with some full Blocks can be selected for this preferred Dispersed Block-Erase operation. However, to obtain the greatest Program-time saving in the subsequent m-page concurrent SLC and MLC Program operation, one WL per Block and one Block per Segment in one or more Groups in one or more NAND planes is preferred. In the above example of two Blocks with a total of 18 WLs selected for erasing, the subsequent m-page SLC and MLC Program can be performed on only two WLs, one selected WL in Block 1 and one selected WL in Block 2. A multi-page Program cannot be performed on all 18 erased WLs simultaneously because these 18 WLs are not distributed one each over 18 dispersed Blocks in 18 dispersed Segments. Embodiments of the present invention disclose a Dispersed Block-Erase scheme with much improved flexibility in terms of erase size and erase WL number, with almost no restriction at all. But for the sake of the m-page concurrent SLC and MLC Program design benefit, a restriction to arrange m dispersed WLs in dispersed Segments for erase is needed.
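The one-WL-per-Block, one-Block-per-Segment restriction can be illustrated with a small selection sketch. It assumes, purely for illustration, that Block 1 resides in Segment 0 and Block 2 in Segment 1; the specification does not state the Segment placement of the two example Blocks, and the function name is hypothetical.

```python
def concurrent_program_pages(erased_pages):
    """Pick at most one erased page per Segment from (segment, block, wl) tuples.

    Keeping one page per Segment also enforces one Block per Segment and
    one WL per Block, which is the restriction for m-page concurrent Program.
    """
    chosen = {}
    for segment, block, wl in sorted(erased_pages):
        chosen.setdefault(segment, (block, wl))
    return chosen

BLOCK1_WLS = [1, 5, 7, 8, 9, 15, 20, 31, 40, 60]
BLOCK2_WLS = [10, 15, 29, 30, 43, 48, 50, 64]

# Hypothetical placement: Block 1 in Segment 0, Block 2 in Segment 1.
erased = [(0, 1, wl) for wl in BLOCK1_WLS] + [(1, 2, wl) for wl in BLOCK2_WLS]
picked = concurrent_program_pages(erased)
print(f"{len(erased)} erased pages, {len(picked)} programmable concurrently: {picked}")

# Contrast: 18 erased pages spread one per Segment could all be programmed together.
dispersed = [(seg, seg, 1) for seg in range(18)]
picked_dispersed = concurrent_program_pages(dispersed)
```

Under the assumed placement, only 2 of the 18 erased pages qualify for a single concurrent Program, whereas 18 pages dispersed over 18 Segments would all qualify.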

In another embodiment, the present invention provides a method for performing a random page(s) concurrent Erase-Verify operation. As mentioned previously, unlike prior-art NAND's Physical Erase-Verify, the random m page(s) concurrent Erase-Verify operation is an Erase-Verify operation conducted in dispersed logic Blocks, which is like the m-page concurrent SLC and MLC Read scheme described earlier. The major difference is that the Erase-Verify voltage for multiple selected WLs is 0V and the bias voltages for unselected WLs are set to Vread, with SSL set to Vdd and GSL set to Vread, where Vread>Vtcmax with a 1.0V margin as explained in FIG. 4. After the lengthy erase time, the Erase-Verify operation is preferably performed on all m erased pages collectively and concurrently to reduce the Erase-Verify time. In addition, an iterative Dispersed M-page Erase-Verify operation can be performed in the same way as the m-page Read and Program-Verify operations.

For the above two selected Erase Blocks (see Table 9), the Erase-Verify conditions are set as below:

1) Block 1:
   a) Selected WLs: WL1=WL5=WL7=WL8=WL9=WL15=WL20=WL31=WL40=WL60=Vss.
   b) Unselected WLs: WL2-4=WL6=WL10-14=WL16-19=WL21-30=WL32-39=WL41-59=WL61-64=Vread=6V.
   c) C_(LBL)=Vinh (Precharged LBL voltage).
   d) SSL=Vdd but GSL=Vread.
2) Block 2:
   a) Selected WLs: WL10=WL15=WL29=WL30=WL43=WL48=WL50=WL64=Vss.
   b) Unselected WLs: WL1-9=WL11-14=WL16-28=WL31-42=WL44-47=WL49=WL51-63=Vread=6V.
   c) C_(LBL)=Vinh (Precharged LBL voltage).
   d) SSL=Vdd but GSL=Vread.

When a C_(LBL) capacitor voltage corresponding to the selected page is discharged from the initial Vinh to 0V, it means all selected WL cells' Vte<−0.5V and Erase-Verify has passed. Otherwise, if the C_(LBL) capacitor voltage remains at Vinh, Erase-Verify fails and a further Erase iteration is required. The Erase step is repeated and is stopped only when the corresponding C_(LBL) capacitor voltage drops to 0V. Erase-Verify can be performed on the above selected erase Blocks, Block 1 and Block 2, at the same time.
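The iterate-until-verified behavior can be sketched as a simple loop. The model below is an assumption made for illustration only: each erase pulse is taken to lower every selected cell's Vt by a fixed hypothetical step, Vinh is taken as 7V (the lower bound given for Vinh), and a String is taken to discharge its C_(LBL) only when all of its selected cells satisfy Vt<−0.5V.

```python
VINH = 7.0          # precharged LBL voltage (text gives Vinh >= 7V)
VERIFY_VT = -0.5    # erase-verify pass level for the erased Vt, in volts

def erase_verify(selected_cell_vts):
    """Return the C_LBL voltage after verify: 0.0 (pass) if all selected cells conduct."""
    passed = all(vt < VERIFY_VT for vt in selected_cell_vts)
    return 0.0 if passed else VINH

def erase_until_verified(selected_cell_vts, vt_step=1.0, max_pulses=10):
    """Apply erase pulses until C_LBL drops to 0V; return (pulse_count, final_vts)."""
    vts = list(selected_cell_vts)
    for pulse in range(1, max_pulses + 1):
        vts = [vt - vt_step for vt in vts]   # hypothetical Vt shift per erase pulse
        if erase_verify(vts) == 0.0:
            return pulse, vts
    raise RuntimeError("Erase-Verify did not pass within max_pulses")

# Hypothetical starting Vt values of three selected cells sharing one LBL.
pulses, final_vts = erase_until_verified([2.0, 0.3, 1.2])
print(f"passed after {pulses} pulses, final Vt = {final_vts}")
```

With these assumed starting values the slowest cell sets the iteration count, which is why the loop stops only after every selected cell has crossed the verify level.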

Although the above has been illustrated according to specific embodiments, there can be other modifications, alternatives, and variations. It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims.

What is claimed is:
1. A HiNAND2 memory chip with two-level bit line (BL) hierarchy array structure for random concurrent and pipeline NAND operations, the HiNAND2 memory chip comprising: a plane of NAND cells formed on a common Triple-Pwell (TPW) region over a deep-Nwell region on a P-substrate, the plane comprising a first plurality of Groups arranged in column direction, each Group being associated with a first number of global bit lines (GBLs) arranged in the column direction as top-level metal lines and being separated from adjacent Groups by a row of Group-divided devices, each Group comprising a second plurality of Segments arranged in the column direction, each Segment being associated with a second number of local bit lines (LBLs) disposed as lower-level metal lines in parallel to the GBLs and being separated from adjacent Segments by a row of Segment-divided devices, each GBL being coupled to one or more LBLs respectively by one or more Segment-select transistors gated respectively with one or more SEG signals, the second number of LBLs being coupled to a common power line respectively through a row of P/D transistors commonly gated by corresponding one of PRE signals, each Segment comprising a third plurality of Blocks wherein each Block comprises the second number of Strings one-to-one parallelly coupled to the second number of LBLs and each String comprising M cells connected in series and capped by a top String-select transistor and a bottom String-select transistor, the top String-select transistor connecting its drain to one corresponding LBL, the bottom String-select transistor connecting its source to a common source line disposed in parallel to but not connected to the common power line, the second number of Strings in a Block forming M pages of cells respectively gated by M word lines (WLs) and all corresponding top String-select transistors are gated by a SSL line and all corresponding bottom String-select transistors are gated by a GSL line; a data buffer configured to store the first number of bits of partial page SLC or MLC data received from external I/Os which are transferred and stored in selected second number of LBLs precharged locally with an inhibit voltage Vinh from the corresponding common power line in one or more sequential cycles through the first number of GBLs; a Segment-decoder configured with a latch to control one or more SEG signals for controlling connection between each GBL and one or more corresponding LBLs; a Block-decoder configured with a latch to connect or disconnect a set of M XTs, 1 SSLp, and 1 GSLp bus lines shared to m sets of M WLs, 1 SSL line, and 1 GSL line on one-set-per-Block basis, m being an integer equal to 1 or greater; wherein m pages of cells being selected on one-page-per-Block basis from m Blocks selected on one-Block-per-Segment basis from the second plurality of Segments of one or more of the first plurality of Groups in the plane are configured to perform m-page all-BL concurrent operations of Erase, Program, Verify, or Read of mixed SLC and MLC data by executing LBL-precharge/discharge, WL-precharge/discharge in concurrent and pipeline manner with self-timed controls of transferring and latching an inhibit voltage Vinh precharged through the common power line per Segment to the LBLs and m sets of M WL voltages, 1 SSL voltage, and 1 GSL voltage through the set of M XTs, 1 GSLp, and 1 SSLp bus lines.
2. The HiNAND2 memory chip of claim 1 wherein the m-page all-BL concurrent operations includes concurrent Erase operation of arbitrary number m pages selected from arbitrary Blocks in one or more Segments in one or more Groups in the plane, mixed concurrent SLC and MLC all-BL Program operation of m pages selected from m Blocks in m Segments of one or more Groups, mixed concurrent SLC and MLC all-BL Program-Verify operation of m pages selected from m Blocks in m Segments of one or more Groups, concurrent all-BL Erase-Verify operation of m pages selected from m Blocks in m Segments of one or more Groups, and mixed concurrent SLC and MLC all-BL Read operation of m pages selected from m Blocks in m Segments of one or more Groups, mixed combination of concurrent SLC/MLC all-BL Program, Read, Program-Verify, Erase-Verify operations of m pages selected from m Blocks in m Segments of one or more Groups, in concurrent and pipeline manner.
3. The HiNAND2 memory chip of claim 1 wherein the WL voltages include a program voltage Vpgm of 15˜25V, a pass voltage Vpass of 9˜11V, and a read voltage V_(READ) of 6˜8V for various NAND operations.
4. The HiNAND2 memory chip of claim 3 wherein the m sets of M WL voltages, 1 SSL voltage, and 1 GSL voltage are subjected to self-timed controls on detection, precharge, discharge, and latching of specific voltage levels for performing m-page all-BL Program, Program-Verify, Erase-Verify, and Read operations in concurrent and pipeline manner.
5. The HiNAND2 memory chip of claim 1 wherein the second number is twice of the first number so that each GBL per Group is associated with two LBLs per Segment respectively coupled by a first Segment-select transistor gated by a SEGo signal for an odd-numbered LBL and a second Segment-select transistor gated by a SEGe signal for an even-numbered LBL.
6. The HiNAND2 memory chip of claim 5 wherein the SEGe signal and SEGo signal are independently set with bias conditions of setting the SEGe signal greater than the Vinh and the SEGo signal to Vss to select the even-numbered LBLs of one selected Segment only for respectively charge-sharing with the corresponding first number of GBLs; setting the SEGo signal greater than the Vinh and the SEGe signal to Vss to select the odd-numbered LBLs of one selected Segment only for respectively charge-sharing with the corresponding first number of GBLs; setting both the SEGe signal and the SEGo signal to Vss for preventing any LBL-GBL charge-sharing operation.
7. The HiNAND2 memory chip of claim 1 wherein the second number is four times of the first number so that each GBL per Group is associated with four LBLs per Segment respectively coupled by a first Segment-select transistor gated by a SEGa signal for a first LBL, a second Segment-select transistor gated by a SEGb signal for a second LBL, a third Segment-select transistor gated by a SEGc signal for a third LBL, and a fourth Segment-select transistor gated by a SEGd signal for a fourth LBL, wherein only one of the SEGa, SEGb, SEGc, and SEGd signals is set to be greater than Vinh to allow the corresponding ¼ number of LBLs of one selected Segment to perform LBL-GBL charge-sharing with the corresponding first number of GBLs.
8. The HiNAND2 memory chip of claim 1 wherein each Group-divided device, each Segment-divided device, each Segment-select transistor, each P/D transistor, each top String-select device, and each bottom String-select device is a same type NMOS 1-poly medium-high-voltage (MHV) transistor.
9. The HiNAND2 memory chip of claim 1 wherein M is selected from 8, 16, 32, 64, 128 or other integer numbers depending on NAND design density.
10. The HiNAND2 memory chip of claim 1 wherein the second number is 65,536 for 8 KB Page size and configured to couple with the data buffer scaled down to 4 KB size or 2 KB size.
11. The HiNAND2 memory chip of claim 1 wherein the data buffer comprises three circuits with same bit length, including a Multiplier circuit per bit for a first amplification of a small analog cell signal to a multiplied analog signal, a latch-type Sense Amplifier (SA) circuit per bit for a second analog amplification of the multiplied analog signal and convert to a full digital signal, and a Program/Read buffer (P/RB) circuit per bit for temporarily storing 1-bit data.
12. The HiNAND2 memory chip of claim 11 wherein the data buffer comprises total number of bits equal to the first number of GBLs, the total number of bits in the data buffer being reduced by half if the second number of LBLs represented to a number of NAND cells in a page is twice of the first number of GBLs, the total number of bits being further scaled down to ¼ if the second number of LBLs is four times of the first number of GBLs.
13. The HiNAND2 memory chip of claim 11 wherein the P/RB circuit is configured to set “0” bit data to pass Vss voltage to channel of a program cell through each corresponding GBL and LBL and to set “1” bit data to pass Vdd voltage to channel of a program-inhibit cell through each corresponding GBL and LBL.
14. The HiNAND2 memory chip of claim 11 wherein the P/RB circuit comprises a pair of latch nodes as a first pair of storage nodes and a pair of gated capacitors as a second pair of storage nodes so that extra temporary storage bits are created to allow more flexible m-page all-BL concurrent MLC MSB and LSB Program-Verify and bit flipping logic operations.
15. The HiNAND2 memory chip of claim 11 wherein the Multiplier circuit comprises an input port receiving a first analog voltage coupled to every drain node of N+1 first transistors and an output port outputting a second analog voltage from a drain node of a first one of the N+1 first transistors, the Multiplier circuit further comprises N capacitors being respectively inserted between two drain nodes of two adjacent first transistors, the Multiplier circuit further comprises N second transistors being respectively coupled drain nodes of last N first transistors and ground, thereby outputting the second analog voltage equal to N-fold of the first analog voltage, where N is an integer≧1.
16. The HiNAND2 memory chip of claim 1 wherein the Segment-decoder comprises a three-input pre-decoder and n number of latch circuits and a local HV pump circuit with n SEGp inputs, a VHH input, and corresponding n outputs of n SEG signals respectively for selecting partial section of a Segment, and further comprises a plurality of control signals to respectively set and clear the n latch circuits and enable and disable the local HV pump circuit to determine a HXS node voltage for controlling voltage charging, latching, and discharging between the n SEGp inputs and corresponding n outputs connected to the n SEG signals per Segment for performing multi-page all-BL Program, Program-Verify, Erase-Verify, and Read operation.
17. The HiNAND2 memory chip of claim 16 wherein the Segment-decoder comprises a function to instantly set the HXS node to Vss by setting one ESB signal of the plurality of control signals with one-shot pulse of Vdd for a preset duration when an unintentional Vdd power loss is detected, allowing the n SEG signals to be set to Vss so that inhibit voltage Vinh precharged to the LBLs can be immediately saved after unexpected power-down but can be reused to continue the operations after power back within a certain idle time.
18. The HiNAND2 memory chip of claim 1 wherein the Block-decoder comprises a latch circuit coupled with one pre-decoder with three address inputs and a local HV pump circuit with a set of M+2 inputs of M XTs, 1 GSLp, and 1 SSLp, a VHH input, and corresponding M+2 outputs coupled to one set of M WLs, SSL, and GSL lines per Block, and further comprises a plurality of control signals to set and clear the latch circuit and enable and disable the local HV pump circuit to determine a HXD node voltage for controlling voltage charging, latching, and discharging between the M+2 inputs of M XTs, 1 GSLp, and 1 SSLp and M+2 outputs connected to a set of M WLs, SSL, and GSL lines of a selected Block for m-page all-BL Program, Program-Verify, Erase-Verify, and Read operation.
19. The HiNAND2 memory chip of claim 18 wherein the HXD node is controlled to be a program voltage Vpgm ramped to 15˜25V plus a cell threshold Vt margin during a SLC or MLC Program operation, or to be a read voltage of Vread of 6˜8V plus a Vt margin during a SLC or MLC Read operation or a SLC or MLC Program-Verify operation or a SLC or MLC random-page Erase-Verify operation, or to be Vss for latching voltages for a set of M WLs, SSL, and GSL lines of a selected Block.
20. The HiNAND2 memory chip of claim 18 wherein the Block-decoder further comprises a function to immediately set the HXD node to Vss by setting an ENB signal of the plurality of control signals with one-shot pulse of Vdd for a preset duration when an unintentional Vdd power loss is detected, allowing the voltages of one set of M WLs, SSL, and GSL lines of the selected Block to be locked to continue last operation.
21. The HiNAND2 memory chip of claim 1 further comprising a physical CACHE Register made of a glue logic circuit associated with the data buffer for storing inputted page data of the first number bit length for performing m-page all-BL concurrent and pipeline operation.
22. The HiNAND2 memory chip of claim 1 wherein the second number of LBLs associated with a first/second/third/fourth Segment comprise the second number of metal parasitic capacitors serving as a first/second/third/fourth pseudo CACHE register with the second number of bits, the second Segment being paired with the first Segment by connecting a row of Segment-divided devices and the fourth Segment being paired with the third Segment by connecting another row of Segment-divided devices.
23. The HiNAND2 memory chip of claim 22 wherein either one of the first, the second, the third, and the fourth pseudo CACHE register is termed a CACHEcel using corresponding the second number of metal parasitic capacitors to temporarily store and latch current new SLC page data of the second number of bits whenever a selected page of a selected Block is in the corresponding one of the first, the second, the third, and the fourth Segment, another one of the four pseudo CACHE registers associated with another Segment paired with the Segment having the selected page is termed a CACHEint to temporarily store and latch last one or more partial page transient SLC data of the first number of bits for performing m-page concurrent and pipeline operation, one of remaining two pseudo CACHE registers is termed a CACHEmsb to temporarily store and latch last one or more partial page MLC MSB page data of the first number of bits, and last pseudo CACHE register is termed a CACHElsb to temporarily store and latch last one or more partial page MLC LSB data of the first number of bits for performing m-page all-BL concurrent and pipeline operation.
24. The HiNAND2 memory chip of claim 23 wherein the CACHEint is configured to be paired with the CACHEcel by residing respectively in a pair of Segments having each corresponding LBL connected by a Segment-divided device gated by a common TIE signal, wherein the TIE signal is set to be greater than Vinh for concurrently precharging the metal parasitic capacitors associated with the corresponding LBLs and is set to Vss for isolating each other to allow one pseudo CACHE to retain and latch the charges therein while the paired pseudo CACHE discharges or performs LBL-GBL charge-sharing for m-page concurrent operation.
25. The HiNAND2 memory chip of claim 22 wherein each of the first/second/third/fourth pseudo CACHE register is configured to store and latch externally-loaded SLC, MLC MSB, and MLC LSB page data converted to Vinh/Vss pattern, internally-generated transient page data in Vinh/Vss pattern during MLC B′-adjustment before All-BL Program, internally-generated transient page data in Vinh/Vss pattern during MLC Program-Verify, internally-generated page data precharged to Vinh, and internally-generated page data in Vinh/Vss pattern by Read operation.
26. The HiNAND2 memory chip of claim 1 wherein each GBL of a selected Group controllably connects to one LBL of a selected Segment in the selected Group via one Segment-select transistor and further optionally connects to other GBLs depending on the location of the selected Group relative to the data buffer by turning on corresponding Group-divided devices to provide a controllable DRAM-like charge-sharing scheme between a lower-level metal parasitic capacitor associated with the one LBL in the selected Segment of the selected Group and one or more top-level metal parasitic capacitors respectively associated with the selected Group and other connected Groups for performing each SLC or MLC random page Read or Program-Verify operation.
27. The HiNAND2 memory chip of claim 26 wherein each lower-level metal parasitic capacitor is charged up to the inhibit voltage Vinh locally from the common power line per Segment by turning on a corresponding P/D transistor, and either all of the second number of LBLs are configured to have the corresponding lower-level metal parasitic capacitors charged concurrently in one cycle from the common power line, or a partial number of the second number of LBLs are configured to have the corresponding lower-level metal parasitic capacitors charged in multiple cycles from the common power line.
28. The HiNAND2 memory chip of claim 26 wherein the charge-sharing scheme includes ½ or ¼ of the second number of LBLs in one selected Segment connected with the corresponding first number of GBLs by turning on a corresponding one row of Segment-select transistors, depending on whether the second number is twice or four times the first number, while the remaining ½ or ¾ of the second number of LBLs are disconnected from the corresponding GBLs by turning off the remaining one or three rows of Segment-select transistors to retain the inhibit voltage Vinh without charge-sharing but ready for a subsequent charge-sharing operation.
29. The HiNAND2 memory chip of claim 26 wherein the inhibit voltage Vinh comprises a variable value ranging from Vdd for the selected Segment in a selected Group nearest to the data buffer to 10V for the selected Segment in a selected Group farthest from the data buffer to save LBL precharge power consumption, wherein the Vinh value is one between Vdd and 10V for the selected Segment in a selected Group between the nearest one and the farthest one, depending on how many Groups are connected from the selected Group to the data buffer.
30. The HiNAND2 memory chip of claim 1 wherein the plane comprises at least one hybrid Block having interleaved pages for respectively storing SLC and MLC data for reducing MLC WL-WL coupling effect.
31. The HiNAND2 memory chip of claim 30 wherein the hybrid Block is configured to place all SLC WLs in odd-numbered WLs thereof and all MLC WLs in even-numbered WLs thereof.
32. The HiNAND2 memory chip of claim 1 wherein the first number of GBLs are top-level metal lines in parallel to the column direction characterized by a pitch size of 4 or 8 base units and a length of one Group size and the second number of LBLs are lower-level metal lines below the GBLs in parallel to the column direction characterized by a pitch size of 2 base units and a length of one Segment.
33. The HiNAND2 memory chip of claim 1 wherein each common power line per Segment and each common source line per Block are conductive lines made of metal or conductive polymer formed below the LBLs and perpendicular to the column direction.
34. The HiNAND2 memory chip of claim 5 wherein the common power line associated with a selected Segment is configured to receive the inhibit voltage Vinh up to 10V from an off-chip voltage generator and connect to the first number of odd-numbered LBLs respectively via a row of P/D transistors gated by a PREo signal and the first number of even-numbered LBLs respectively via another row of P/D transistors gated by a PREe signal, wherein the PREo signal and PREe signal are set for controlling precharge of the inhibit voltage Vinh to corresponding LBLs associated with the selected Segment with at least one page selected for performing m-page concurrent and pipeline operation.
35. The HiNAND2 memory chip of claim 34 wherein the inhibit voltage Vinh up to 10V precharged to LBLs of the selected Segment provides a program-inhibit voltage larger than Vdd for Program operation in a super self-boosting Program-Inhibit (SSBPI) scheme with reduced channel voltage boosting.
36. The HiNAND2 memory chip of claim 34 wherein the LBLs with the precharged Vinh in a selected Segment are configured to partially discharge the Vinh from some LBLs to Vss and partially retain the precharged Vinh for other LBLs by controlling one or more rows of Segment-select transistors to open connection between the LBLs and corresponding GBLs and also controlling one or more rows of P/D transistors to close the connection of the LBLs to the common power line to convert a partial page data with a Vdd/Vss pattern stored in the data buffer to a corresponding partial page data with a Vinh/Vss pattern stored in the LBLs of the selected Segment for m-page all-BL concurrent and pipeline operation.
37. The HiNAND2 memory chip of claim 7 wherein the common power line associated with a selected Segment is configured to receive an inhibit voltage Vinh up to 10V from an off-chip voltage generator and connect to the first number of first LBLs respectively via a first row of P/D transistors gated by a PREa signal, to the first number of second LBLs respectively via a second row of P/D transistors gated by a PREb signal, to the first number of third LBLs respectively via a third row of P/D transistors gated by a PREc signal, and to the first number of fourth LBLs respectively via a fourth row of P/D transistors gated by a PREd signal, wherein the PREa, PREb, PREc, and PREd signals are for controlling precharge of the inhibit voltage Vinh to corresponding LBLs associated with the selected Segment with at least one page selected for performing m-page all-BL concurrent and pipeline operation.
38. The HiNAND2 memory chip of claim 1 wherein a random number of pages of NAND cells selected from one or more Blocks of one or more Segments in one or more Groups in the plane are subjected to a concurrent Erase operation by setting WLs of all selected pages to 0V and floating all unselected WLs, SSL lines, and GSL lines in any selected and unselected Blocks, and subsequently ramping voltages of the TPW and deep N-well of the plane to about 16-20V.
39. The HiNAND2 memory chip of claim 38 wherein the random number of pages comprise m pages selected on one-page-per-Block basis from m Blocks selected on one-Block-per-Segment basis from m Segments in one or more Groups in the plane so that the m-page concurrent SLC and MLC Program operation can be performed after the concurrent Erase operation.
40. The HiNAND2 memory chip of claim 1 wherein the m pages selected from one or more Blocks in one or more Segments of one or more Groups are subjected to m-page Erase-Verify operation by setting WLs of all selected m pages to 0V, setting unselected WLs in each selected Block to Vread of ~6V, setting SSL lines and GSL lines of all selected Blocks to Vdd or Vread, setting unselected WLs, SSL lines and GSL lines of unselected Blocks to 0V, precharging and latching all LBLs associated with the selected Segments to Vinh, enabling a Multiplier, a Sense Amplifier and a Program/Read Buffer in the data buffer, and setting the TPW to 0V and deep N-well to Vdd in the plane.
41. The HiNAND2 memory chip of claim 40 wherein the m-page Erase-Verify operation comprises iterative operation steps of utilizing the Multiplier, the Sense Amplifier and the Program/Read Buffer through a LBL-GBL charge-sharing scheme to determine whether the precharged inhibit voltage Vinh on the LBLs in a selected Segment containing one or more pages for Erase-Verify drops to Vss; if so, floating the WLs corresponding to the one or more pages in the selected Segment to be erase-inhibited; otherwise, if the precharged inhibit voltage Vinh on the LBLs in the selected Segment does not drop to Vss, further performing Erase operation on the corresponding one or more pages.
42. The HiNAND2 memory chip of claim 22 wherein the m pages of NAND cells are selected on one-page-per-Block basis from m Blocks selected on one-Block-per-Segment basis from m Segments in one or more Groups in the plane for respectively loading and latching either SLC or MLC (MSB) page data from external I/Os to m pseudo CACHEs in 2m cycles in pipeline manner if the second number of LBLs is twice the first number of GBLs per page, then the m pages of NAND cells are programmed by performing m-page all-BL SLC or MLC (MSB) Program operation concurrently with fully overlapping operation time intervals if all m pages have a same WL address or with partially overlapping operation time intervals if the m pages have random WL addresses.
43. The HiNAND2 memory chip of claim 42 wherein the all-BL SLC or MLC (MSB) Program operation per cell is configured to increase the number of cell logic states from one initial erase state to two logic states including an erase E-state with a threshold level Vt<0V and an initial program B′-state with a predetermined minimum Vt value set to be no smaller than a maximum Vt value set for a final lowest program logic state A-state generated in a MLC (LSB) Program operation.
44. The HiNAND2 memory chip of claim 43 wherein the initial program B′-state comprises a Vt distribution divided into two sections of B1′-state and B2′-state by a Vtbmin value preset for a minimum Vt value of a second program state of a programmed MLC cell, wherein the B1′-state and B2′-state are subjected to a B′-adjustment bit-flipping operation upon loading of MLC (LSB) data to generate a final second program logic state B-state and a final third program logic state C-state in the MLC (LSB) Program operation.
45. The HiNAND2 memory chip of claim 42 wherein the m pages of NAND cells are selected on one-page-per-Block basis from m Blocks selected on one-Block-per-Segment basis from m Segments in one or more Groups in the plane for performing m-page All-BL Program-Verify operation including precharging Vinh to the second number of LBLs associated with respective m pages of the m Segments simultaneously in one cycle, and sharing charges associated with the precharged Vinh of each LBL with corresponding connected long GBLs on half-page-by-half-page basis in two cycles, if the second number of LBLs is twice the first number of GBLs per page.
46. The HiNAND2 memory chip of claim 23 wherein the m pages of NAND cells are selected on one-page-per-Block basis from m Blocks selected on one-Block-per-Segment basis from m Segments in one or more Groups in the plane for sequentially loading and latching the m-page MLC (LSB page) data from external I/Os to m sets of 4 pseudo CACHEs in 2m cycles in pipeline manner if the second number of LBLs is twice the first number of GBLs per page, each set of 4 pseudo CACHEs including a CACHEcel, a CACHEint, a CACHElsb, and a CACHEmsb.
47. The HiNAND2 memory chip of claim 23 wherein the m pages of NAND cells are selected on one-page-per-Block basis from m Blocks selected on one-Block-per-Segment basis from m Segments in one or more Groups in the plane for performing m-page all-BL MLC Program-Verify operation using a predetermined Vtamin verify voltage to firstly verify an A-state of a final MLC lowest program state by transferring last updated data from the CACHEint per bit to two first storage nodes of a Program/Read Buffer in the data buffer, transferring MLC MSB page data from the CACHEmsb per bit to a Sense Amplifier in the data buffer through a Multiplier for amplification and then writing back to the CACHEmsb in same data polarity and transferring into one second storage node of the Program/Read Buffer, and verifying currently updated data from the CACHEcel in the Sense Amplifier per bit against data in both the first storage nodes and a second storage node of the Program/Read Buffer per bit based on Vt distribution of the A-state.
48. The HiNAND2 memory chip of claim 47 wherein the m-page all-BL MLC Program-Verify operation further comprises using a predetermined Vtbmin verify voltage to verify a B-state of a second MLC program state by transferring MLC LSB page data from the CACHElsb per bit to the Sense Amplifier through a Multiplier for amplification and then writing back to the CACHElsb in same data polarity and transferring into one second storage node of the Program/Read Buffer per bit, and verifying currently updated data from the CACHEcel per bit in the Sense Amplifier against data in the first storage nodes transferred from the CACHEint and data in the second storage node of the Program/Read Buffer per bit based on Vt distribution of the B-state.
49. The HiNAND2 memory chip of claim 47 wherein the m-page all-BL MLC Program-Verify operation further comprises using a predetermined Vtcmin verify voltage to verify a C-state of a third MLC program state by verifying currently updated data from the CACHEcel per bit in the Sense Amplifier against only data in the first storage nodes transferred from the CACHEint based on Vt distribution of the C-state, updating verified data in the CACHEcel and the CACHEint, and continuously performing next iterative All-BL MLC LSB Program operation based on the updated data in the CACHEcel until the MLC Program-Verify operation is passed.
50. The HiNAND2 memory chip of claim 47 wherein the m-page all-BL MLC Program-Verify operation comprises performing m-page Even-numbered half-page Program-Verify operations and Odd-numbered half-page Program-Verify operations with partially overlapping time intervals on page-by-page basis in a concurrent/pipeline manner and optionally with a reversed sequence of the Even-numbered half-page Program-Verify operation and the Odd-numbered half-page Program-Verify operation once per iterative verify step.
51. The HiNAND2 memory chip of claim 23 wherein the m pages of NAND cells are selected on one-page-per-Block basis from m Blocks selected on one-Block-per-Segment basis from m Segments in one or more Groups in the plane for performing m-page All-BL SLC Read operation by using one pseudo CACHE for storing and latching precharged Vinh data in Even-numbered half-page and Odd-numbered half-page of the LBLs in two cycles with partially overlapping time intervals in a concurrent/pipeline manner regardless of m random pages or m non-random pages.
52. The HiNAND2 memory chip of claim 23 wherein the m pages of NAND cells are selected on one-page-per-Block basis from m Blocks selected on one-Block-per-Segment basis from m Segments in one or more Groups in the plane for performing m-page all-BL MLC (MSB page) Read operation by using one pseudo CACHE for storing temporary read data for distinguishing an initial program B′-state from erase E-state per cell via Even-numbered half-page and Odd-numbered half-page in two cycles with partially overlapping time intervals in a concurrent/pipeline manner regardless of m random pages or m non-random pages with a condition that each page includes a Flag cell assigned to 1 to indicate each addressed MLC-WL storing 2-Vt of MSB page data.
53. The HiNAND2 memory chip of claim 23 wherein the m pages of NAND cells are selected on one-page-per-Block basis from m Blocks selected on one-Block-per-Segment basis from m Segments in one or more Groups in the plane for performing m-page all-BL MLC (LSB page) Read operation by using three pseudo CACHEs including CACHEcel, CACHEint, and CACHEmsb to distinguish four logic states per cell including an erase E-state, a first program A-state, a second program B-state, and a third program C-state with partially overlapping time intervals on page-by-page basis in a concurrent/pipeline manner regardless of m random pages or m non-random pages with a condition that each page includes a Flag cell assigned to 0 to indicate each addressed MLC-WL storing 4-Vt of both MSB and LSB page data.
54. The HiNAND2 memory chip of claim 53 wherein the m-page all-BL MLC (LSB page) Read operation comprises page-based operation steps of: utilizing the CACHEcel and the CACHEint tied as a pair by a row of Segment-divided devices gated by a common TIE signal to simultaneously store and latch temporary read page data via a first predetermined read voltage for distinguishing the E-state from A-state, B-state, and C-state; setting a common gate signal of the row of Segment-divided devices to Vss to isolate the CACHEint from the CACHEcel; utilizing the CACHEcel for storing and latching temporary read page data via a second predetermined read voltage for distinguishing the E-state and A-state from the B-state and C-state, the page data in the CACHEcel being latched with a MSB page data; transferring the MSB page data to a Sense Amplifier of the data buffer per bit and writing a page data having same polarities back to, and latching it in, the CACHEmsb; utilizing the CACHEcel for storing and latching temporary read page data via a third predetermined read voltage for distinguishing the E-state, A-state, and B-state from the C-state; restoring the page data in the CACHEint to two first storage nodes of a Program/Read Buffer in the data buffer per bit; restoring the page data in the CACHEmsb to the Sense Amplifier per bit and transferring to two second storage nodes of the Program/Read Buffer per bit; restoring the page data in the CACHEcel to the Sense Amplifier per bit; flipping bit polarity of each B-state bit per cell in the Program/Read Buffer; and reading out data from the Program/Read Buffer per bit to obtain MLC LSB page data.
55. The HiNAND2 memory chip of claim 1 wherein each NAND cell comprises threshold voltage Vt assignments including 2 Vts for storing each SLC bit, 2 Vts for storing each MLC MSB-bit, and 4 Vts for storing both MLC MSB and MLC LSB bits, wherein both 2-Vt SLC bit and 2-Vt MSB bit include two logic states of an erase state and a common transient program state B′-state and two 4-Vt MLC bits include four logic states of an erase state, a first program state, a second program state, and a third program state, the B′-state being initially set to have its Vt minimum value no smaller than Vt maximum of the first program state.
56. The HiNAND2 memory chip of claim 1 wherein the m-page concurrent and pipeline operations are performed by executing preferred Command sets issued from a Host, the Command sets comprising a Start code followed by m consecutive page Addresses followed by m SLC page data or 2m MLC page data and an End code, wherein m≧1.
57. The HiNAND2 memory chip of claim 1 further comprising a WL HV voltage detector including a dummy WL coupled to a WL HV pump circuit, a differential amplifier circuit having a first input coupled to the dummy WL to receive a WL HV voltage and a second input coupled to a reference voltage generator to receive a reference voltage, the dummy WL being configured to be a middle one of a three-WL layout for simulating a worst-case resistance (R) and capacitance (C) of a real word line having two neighbors, the differential amplifier circuit detecting the WL HV voltage reaching the reference voltage to trigger a generation of an output EN signal of a one-shot pulse of Vdd.
58. The HiNAND2 memory chip of claim 57 wherein the WL HV voltage comprises a WL program voltage independently generated for tracking voltage of a selected page for initiating a self-timed control during m-page concurrent and pipeline Program operation or separately a WL read voltage independently generated for initiating a self-timed control during m-page concurrent and pipeline Read, Program-Verify, or Erase-Verify operation.
59. The HiNAND2 memory chip of claim 57 wherein the dummy WL is initially set to Vss and is increased as the WL is associated with one of the Program, Read, Program-Verify, and Erase-Verify operation, wherein once the voltage of a selected page reaches the reference voltage, the EN signal of one-shot pulse of Vdd is sent to the Block-decoder for latching voltages for a corresponding set of M WLs, 1 SSL line, and 1 GSL line.
60. The HiNAND2 memory chip of claim 57 wherein the WL HV voltage detector further comprises a function of using 1V as a reference voltage for tracking discharge of a WL from an initial HV voltage.
61. The HiNAND2 memory chip of claim 1 further comprising a LBL voltage detector including a differential amplifier, a reference voltage generator, and a sense line coupled directly to each common power line per Segment to detect voltages charged in or discharged from either partial or total of the second number of LBLs, the voltages being temporarily charged and latched in corresponding lower-level metal lines as metal parasitic capacitors served as pseudo CACHEs for performing m-page concurrent and pipeline Program, Read, Program-Verify, and Erase-Verify operations.
62. The HiNAND2 memory chip of claim 1 wherein each NAND cell in the plane comprises a high-voltage 2-poly floating-gate NMOS transistor, or a 1-poly charge-trapping MONOS or SONOS transistor, or a 2-poly floating-gate or 1-poly charge-trapping Nitride layer 3D transistor, or a Vertical-gate or Vertical-channel 3D NAND transistor.
63. The HiNAND2 memory chip of claim 1 wherein the second number is zero with the LBLs and corresponding Segment-select transistor being removed to form a HiNAND1 array with 1-level BL structure for performing m-page concurrent Program, Read, and Verify operation.
64. A method for operating a HiNAND2 memory chip with two-level bit line (BL) hierarchy array structure during multi-page concurrent SLC or MLC (MSB) Program and Program-Verify operation, the method comprising:
providing a HiNAND2 memory chip in connection with a host and a flash controller via I/Os circuit, the HiNAND2 memory chip comprising:
a plane of NAND cells formed on a common Triple-Pwell (TPW) region over a deep-Nwell region on a P-substrate, the plane comprising a first plurality of Groups arranged in column direction, each Group being associated with a first number of global bit lines (GBLs) arranged in the column direction as top-level metal lines and being separated from adjacent Groups by a row of Group-divided devices, each Group comprising a second plurality of Segments arranged in the column direction, each Segment being associated with a second number of local bit lines (LBLs) disposed as lower-level metal lines in parallel to the GBLs and being separated from adjacent Segments by a row of Segment-divided devices, each GBL being coupled to one or more LBLs respectively by one or more Segment-select transistors gated respectively with one or more SEG signals, the second number of LBLs being coupled to a common power line respectively through a row of P/D transistors commonly gated by a corresponding one of PRE signals, each Segment comprising a third plurality of Blocks wherein each Block comprises the second number of Strings one-to-one parallelly coupled to the second number of LBLs and each String comprises M cells connected in series and capped by a top String-select transistor and a bottom String-select transistor, the top String-select transistor connecting its drain to one corresponding LBL, the bottom String-select transistor connecting its source to a common source line disposed in parallel to but not connected to the common power line, the second number of Strings in a Block forming M pages of cells respectively gated by M word lines (WLs), all corresponding top String-select transistors being gated by an SSL line and all corresponding bottom String-select transistors being gated by a GSL line;
a data buffer configured to store the first number of bits of partial page SLC or MLC data received from external I/Os which are transferred and stored in selected second number of LBLs precharged locally with an inhibit voltage Vinh from the corresponding common power line in one or more sequential cycles through the first number of GBLs;
a Segment-decoder configured with a latch to control one or more SEG signals for controlling connection between each GBL and one or more corresponding LBLs;
a Block-decoder configured with a latch to connect or disconnect a set of M XTs, 1 SSLp, and 1 GSLp bus lines shared to m sets of M WLs, 1 SSL line, and 1 GSL line on one-set-per-Block basis, m being an integer equal to 1 or greater;
precharging the inhibit voltage Vinh up to 10V simultaneously in one cycle from the common power line per selected Segment to the second number of LBLs associated with m pages selected on one-page-per-Block basis from m Blocks selected on one-Block-per-Segment basis from m Segments in one or more Groups in the plane, m being an integer greater than one;
loading and latching SLC or MLC (MSB) page data from the data buffer in pipeline manner up to m pages into selected precharged LBLs in 2 or 4 cycles depending on whether the second number of LBLs is twice or four times the first number of GBLs per page, m being an integer greater than 1, each of the m pages being selected on one-page-per-Block basis from m Blocks selected on one-Block-per-Segment basis from m Segments in one or more Groups in the plane;
setting and latching corresponding voltages to m sets of M WLs, 1 SSL line, and 1 GSL line for the m selected Blocks each with one selected page;
selectively discharging-and-retaining the precharged inhibit voltage Vinh in the second number of LBLs on page-by-page basis simultaneously in one cycle;
performing m-page Program concurrently for increasing each cell logic state from one initial erase state to two logic states including an erase E-state with a threshold level Vt<0V and an initial program B′-state with a predetermined minimum Vt value set to be no smaller than a maximum Vt value set for a final lowest program logic state A-state to be generated in a MLC (LSB) Program operation;
precharging the Vinh to two sets of the second number of LBLs of two paired Segments simultaneously in one cycle and latching the precharged Vinh to one set of the second number of LBLs;
sharing charges associated with the precharged Vinh of the other set of the second number of LBLs respectively with the first number of GBLs in two or four cycles; and
using a minimum value of Vt distribution of the B′-state to confirm successful Program.
65. The method of claim 64 wherein performing m-page Program concurrently comprises programming the m pages concurrently with fully overlapping operation time intervals if all the m pages have a same WL address or programming the m pages with partially overlapping operation time intervals if the m pages have random WL addresses.
66. The method of claim 64 wherein the second number of LBLs associated with a first/second/third/fourth Segment comprise the second number of metal parasitic capacitors serving as a first/second/third/fourth pseudo CACHE register with the second number of bits, the second Segment being paired with the first Segment by connecting a row of Segment-divided devices and the fourth Segment being paired with the third Segment by connecting another row of Segment-divided devices.
67. The method of claim 66 further comprising using the Segment-decoder per Segment configured to latch the precharged inhibit voltage Vinh in the second number of LBLs associated with the m pages to corresponding m pseudo CACHEs in 2m cycles in pipeline manner.
68. The method of claim 66 wherein any one of the first, the second, the third, and the fourth pseudo CACHE register is termed a CACHEcel using the corresponding second number of metal parasitic capacitors to temporarily store and latch current new SLC page data of the second number of bits whenever a selected page of a selected Block is in the corresponding one of the first, the second, the third, and the fourth Segment, another one of the four pseudo CACHE registers associated with another Segment paired with the Segment having the selected page is termed a CACHEint to temporarily store and latch last one or more partial page transient SLC data of the first number of bits for performing m-page concurrent and pipeline operation, one of the remaining two pseudo CACHE registers is termed a CACHEmsb to temporarily store and latch last one or more partial page MLC MSB page data of the first number of bits, and the last pseudo CACHE register is termed a CACHElsb to temporarily store and latch last one or more partial page MLC LSB data of the first number of bits for performing m-page all-BL concurrent and pipeline operation.
69. The method of claim 68 wherein the CACHEint is configured to be paired with the CACHEcel by residing respectively in a pair of Segments having each corresponding LBL connected by a Segment-divided device gated by a common TIE signal, wherein the TIE signal is set to be greater than Vinh for concurrently precharging the metal parasitic capacitors associated with the corresponding LBLs and is set to Vss for isolating each other to allow one pseudo CACHE to retain and latch the charges therein while the paired pseudo CACHE discharges or performs LBL-GBL charge-sharing for m-page concurrent operation.
70. The method of claim 66 wherein each of the first/second/third/fourth pseudo CACHE register is configured to store externally-loaded SLC, MLC MSB, and MLC LSB page data converted to Vinh/Vss pattern, internally-generated transient page data in Vinh/Vss pattern during MLC B′-adjustment before All-BL Program, internally-generated transient page data in Vinh/Vss pattern during MLC Program-Verify, internally-generated page data precharged to Vinh, and internally-generated page data in Vinh/Vss pattern by Read operation.
71. The method of claim 64 wherein the initial program B′-state comprises a Vt distribution divided into two sections of B1′-state and B2′-state by a Vtbmin value preset for a minimum Vt value of a second program state of a programmed MLC cell, wherein the B1′-state and B2′-state are subjected to a B′-adjustment bit-flipping operation upon loading of MLC (LSB) data to generate a final second program logic state B-state and a final third program logic state C-state in the MLC (LSB) Program operation.
72. The method of claim 64 wherein the second number is twice the first number so that each GBL per Group is associated with two LBLs per Segment respectively coupled by a first Segment-select transistor gated by a SEGo signal for an odd-numbered LBL and a second Segment-select transistor gated by a SEGe signal for an even-numbered LBL.
73. The method of claim 72 wherein the SEGe signal and SEGo signal are independently set with bias conditions of: setting the SEGe signal greater than the Vinh and the SEGo signal to Vss to select the even-numbered LBLs of one selected Segment only for respectively charge-sharing with the corresponding first number of GBLs; setting the SEGo signal greater than the Vinh and the SEGe signal to Vss to select the odd-numbered LBLs of one selected Segment only for respectively charge-sharing with the corresponding first number of GBLs; and setting both the SEGe signal and the SEGo signal to Vss for preventing any LBL-GBL charge-sharing operation.
74. The method of claim 64 wherein the second number is four times the first number so that each GBL per Group is associated with four LBLs per Segment respectively coupled by a first Segment-select transistor gated by a SEGa signal for a first LBL, a second Segment-select transistor gated by a SEGb signal for a second LBL, a third Segment-select transistor gated by a SEGc signal for a third LBL, and a fourth Segment-select transistor gated by a SEGd signal for a fourth LBL, wherein only one of the SEGa, SEGb, SEGc, and SEGd signals at a time is set to be greater than Vinh to allow the corresponding ¼ number of LBLs of one selected Segment to perform LBL-GBL charge-sharing with the corresponding first number of GBLs.
75. The method of claim 64 wherein each Group-divided device, each Segment-selected device, each Segment-divided device, each P/D transistor, each top String-select device, and each bottom String-select device is a same type NMOS 1-poly medium-high-voltage (MHV) transistor.
76. The method of claim 64 wherein M is selected from 8, 16, 32, 64, 128 or other integer numbers depending on NAND design density.
77. The method of claim 64 wherein the second number is 65,536 for 8 KB Page size and configured to couple with the data buffer scaled down to 4 KB size or 2 KB size.
78. The method of claim 64 wherein the data buffer comprises three circuits with same bit length, including a Multiplier circuit per bit for a first amplification of a small analog cell signal to a multiplied analog signal, a latch-type Sense Amplifier (SA) circuit per bit for a second analog amplification of the multiplied analog signal and conversion to a full digital signal, and a Program/Read buffer (P/RB) circuit per bit for temporarily storing 1-bit data.
79. The method of claim 78 wherein the data buffer comprises a total number of bits equal to the first number of GBLs, the total number of bits in the data buffer being reduced by half if the second number of LBLs, representing the number of NAND cells in a page, is twice the first number of GBLs, the total number of bits being further scaled down to ¼ if the second number of LBLs is four times the first number of GBLs.
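The buffer scaling recited above, together with the 65,536-LBL, 8 KB page example of claim 77, amounts to simple arithmetic. The following is a minimal sketch; the helper names `data_buffer_bits` and `bits_to_kb` are hypothetical, and the ratio of 1 is included only as a baseline for comparison with a buffer as wide as the page.

```python
def data_buffer_bits(num_lbls: int, lbls_per_gbl: int) -> int:
    """Number of data-buffer bits, equal to the number of GBLs.

    num_lbls:     LBLs per Segment (NAND cells per page)
    lbls_per_gbl: 2 or 4 per the claims; 1 is a baseline for comparison only
    """
    if lbls_per_gbl not in (1, 2, 4):
        raise ValueError("lbls_per_gbl must be 1, 2, or 4")
    if num_lbls % lbls_per_gbl != 0:
        raise ValueError("num_lbls must be divisible by lbls_per_gbl")
    return num_lbls // lbls_per_gbl


def bits_to_kb(bits: int) -> int:
    """Convert a bit count to kilobytes (1 KB = 1024 bytes)."""
    return bits // 8 // 1024


if __name__ == "__main__":
    page_lbls = 65_536  # 8 KB page
    for ratio in (1, 2, 4):
        bits = data_buffer_bits(page_lbls, ratio)
        print(f"{ratio} LBL(s) per GBL -> {bits} bits = {bits_to_kb(bits)} KB")
```

With two LBLs per GBL the buffer holds 4 KB, and with four LBLs per GBL it holds 2 KB, matching the sizes given in claim 77.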
80. The method of claim 78 wherein the P/RB circuit is configured to set “0” bit data to pass Vss voltage to channel of a program cell through each corresponding GBL and LBL and to set “1” bit data to pass Vdd voltage to channel of a program-inhibit cell through each corresponding GBL and LBL.
81. The method of claim 78 wherein the P/RB circuit comprises a pair of latch nodes as a first pair of storage nodes and a pair of gated capacitors as a second pair of storage nodes so that extra temporary storage bits are created to allow more flexible m-page all-BL concurrent MLC MSB and LSB Program-Verify and bit-flipping logic operations.
82. The method of claim 78 wherein the Multiplier circuit comprises an input port receiving a first analog voltage coupled to every drain node of N+1 first transistors and an output port outputting a second analog voltage from a drain node of a first one of the N+1 first transistors, the Multiplier circuit further comprises N capacitors being respectively inserted between two drain nodes of two adjacent first transistors, the Multiplier circuit further comprises N second transistors being respectively coupled between drain nodes of the last N first transistors and ground, thereby outputting the second analog voltage equal to N-fold of the first analog voltage, where N is an integer ≥ 1.
83. The method of claim 64 wherein the Segment-decoder comprises a three-input pre-decoder and n number of latch circuits and a local HV pump circuit with n SEGp inputs, a VHH input, and corresponding n outputs of n SEG signals respectively for selecting partial section of a Segment, and further comprises a plurality of control signals to respectively set and clear the n latch circuits and enable and disable the local HV pump circuit to determine a HXS node voltage for controlling voltage charging, latching, and discharging between the n SEGp inputs and corresponding n outputs connected to the n SEG signals per Segment for performing multi-page all-BL Program, Program-Verify, Erase-Verify, and Read operation.
84. The method of claim 83 wherein the Segment-decoder comprises a function to instantly set the HXS node to Vss by setting one ESB signal of the plurality of control signals with a one-shot pulse of Vdd for a preset duration when an unintentional Vdd power loss is detected, allowing the n SEG signals to be set to Vss so that the inhibit voltage Vinh precharged to the LBLs can be immediately saved after unexpected power-down and can be reused to continue the operations after power is restored within a certain idle time.
85. The method of claim 64 wherein the Block-decoder comprises a latch circuit coupled with one pre-decoder with three address inputs and a local HV pump circuit with a set of M+2 inputs of M XTs, 1 GSLp, and 1 SSLp, a VHH input, and corresponding M+2 outputs coupled to one set of M WLs, SSL, and GSL lines per Block, and further comprises a plurality of control signals to set and clear the latch circuit and enable and disable the local HV pump circuit to determine a HXD node voltage for controlling voltage charging, latching, and discharging between the M+2 inputs of M XTs, 1 GSLp, and 1 SSLp and M+2 outputs connected to a set of M WLs, SSL, and GSL lines of a selected Block for m-page all-BL Program, Program-Verify, Erase-Verify, and Read operation.
86. The method of claim 85 wherein the HXD node is controlled to be a program voltage Vpgm ramped to 15-25V plus a cell threshold Vt margin during a SLC or MLC Program operation, or to be a read voltage of Vread of 6-8V plus a Vt margin during a SLC or MLC Read operation or a SLC or MLC Program-Verify operation or a SLC or MLC random-page Erase-Verify operation, or to be Vss for latching voltages for a set of M WLs, SSL, and GSL lines of a selected Block.
87. The method of claim 85 wherein the Block-decoder further comprises a function to immediately set the HXD node to Vss by setting an ENB signal of the plurality of control signals with a one-shot pulse of Vdd for a preset duration when an unintentional Vdd power loss is detected, allowing the voltages of one set of M WLs, SSL, and GSL lines of the selected Block to be locked to continue the last operation.
88. The method of claim 64 wherein sharing charges comprises connecting each of the first number of GBLs of a selected Group to one of the second number of LBLs of a selected Segment in the selected Group via one Segment-select transistor and further optionally connecting the GBL to other GBLs associated with other Groups by turning on corresponding Group-divided devices so that charges in a lower-level metal parasitic capacitor associated with the one LBL in the selected Segment are shared with one or more top-level metal parasitic capacitors respectively associated with one or more connected GBLs of the selected Group and connected other Groups.
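The charge-sharing step recited above can be illustrated with ideal charge conservation between one LBL parasitic capacitor and the GBL capacitors it is connected to. This is a sketch under simplifying assumptions that are not stated in the claims: ideal switches with no threshold drop, no leakage, and GBL capacitors starting at a common initial voltage. The function name `shared_voltage` and all capacitance values are hypothetical.

```python
from typing import Sequence


def shared_voltage(
    v_lbl: float,
    c_lbl: float,
    c_gbls: Sequence[float],
    v_gbl_initial: float = 0.0,
) -> float:
    """Final voltage after an LBL capacitor shares charge with connected GBLs.

    v_lbl:         voltage latched on the LBL before sharing (e.g. Vinh or Vss)
    c_lbl:         LBL parasitic capacitance (arbitrary but consistent units)
    c_gbls:        capacitance of each connected GBL section (one per Group)
    v_gbl_initial: common starting voltage of the GBL sections (assumed)
    """
    c_gbl_total = sum(c_gbls)
    total_charge = c_lbl * v_lbl + c_gbl_total * v_gbl_initial
    return total_charge / (c_lbl + c_gbl_total)


if __name__ == "__main__":
    # One Group connected, then two Groups connected through a Group-divided device.
    print(shared_voltage(7.0, 1.0, [1.0]))       # 3.5
    print(shared_voltage(7.0, 1.0, [1.0, 1.5]))  # 2.0
```

The more GBL sections are connected, the more the LBL signal is diluted, which is consistent with the data buffer including a Multiplier stage ahead of the Sense Amplifier.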
89. The method of claim 64 wherein precharging the inhibit voltage Vinh comprises charging each lower-level metal parasitic capacitor associated with one of the second number of LBLs from the common power line per Segment by turning on a corresponding P/D transistor, the inhibit voltage Vinh being larger than Vdd and up to 10V for using a super self-boosting Program-Inhibit (SSBPI) scheme in NAND cell Program operation with reduced channel voltage boosting.
90. The method of claim 88 wherein sharing charges further comprises connecting ½ or ¼ of the second number of LBLs in one selected Segment with the corresponding first number of GBLs by turning on a corresponding one row of Segment-select transistors, depending on whether the second number is twice or four times the first number, while disconnecting the remaining ½ or ¾ of the second number of LBLs from the corresponding GBLs by turning off the remaining one or three rows of Segment-select transistors to retain the inhibit voltage Vinh therein without charge-sharing but ready for subsequent charge-sharing operation.
91. The method of claim 89 wherein the inhibit voltage Vinh comprises a variable value ranging from Vdd for the selected Segment in a selected Group nearest to the data buffer to 10V for the selected Segment in a selected Group farthest from the data buffer to save LBL precharge power consumption, wherein the Vinh value is one between Vdd and 10V for the selected Segment in a selected Group between the nearest one and the farthest one depending on how many Groups are connected from the selected Group to the data buffer.
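The distance-dependent Vinh recited above can be sketched as a function of how many Groups lie between the selected Group and the data buffer. The claims fix only the end points (Vdd for the nearest Group, 10V for the farthest) and state that intermediate Groups use a value in between; the linear interpolation used here, the function name `vinh_for_group`, and the Vdd value of 1.8V are assumptions made purely for illustration.

```python
def vinh_for_group(
    groups_to_buffer: int,
    total_groups: int,
    vdd: float = 1.8,        # hypothetical supply voltage
    vinh_max: float = 10.0,  # upper bound recited in the claims
) -> float:
    """Illustrative Vinh for a Segment in the k-th Group from the data buffer.

    groups_to_buffer: 1 for the Group nearest the data buffer,
                      total_groups for the farthest Group.
    Linear interpolation between the end points is an assumption.
    """
    if not 1 <= groups_to_buffer <= total_groups:
        raise ValueError("groups_to_buffer must be in 1..total_groups")
    if total_groups == 1:
        return vdd
    frac = (groups_to_buffer - 1) / (total_groups - 1)
    return vdd * (1.0 - frac) + vinh_max * frac


if __name__ == "__main__":
    for k in range(1, 9):
        print(f"Group {k} of 8: Vinh = {vinh_for_group(k, 8):.2f} V")
```

Precharging nearer Segments to a lower Vinh is what saves LBL precharge power, since fewer GBL capacitors dilute the signal on the way to the data buffer.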
92. The method of claim 64 wherein the plane comprises at least one hybrid Block having interleaved mixed pages for respectively storing SLC and MLC data for reducing MLC WL-WL coupling effect.
93. The method of claim 92 wherein the hybrid Block is configured to place all SLC WLs in odd-numbered WLs thereof and all MLC WLs in even-numbered WLs thereof.
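The odd/even word-line assignment of the hybrid Block can be sketched as follows. The function names `wl_mode` and `hybrid_block_layout` are hypothetical, and the sketch assumes word lines are numbered from 0 (WL0 to WL(M-1)); the claims do not state the numbering origin.

```python
from typing import List


def wl_mode(wl_index: int) -> str:
    """Return the storage mode of a word line in a hybrid Block.

    Odd-numbered WLs store SLC data and even-numbered WLs store MLC data.
    Numbering from 0 is an assumption made for this illustration.
    """
    if wl_index < 0:
        raise ValueError("wl_index must be non-negative")
    return "SLC" if wl_index % 2 == 1 else "MLC"


def hybrid_block_layout(m: int) -> List[str]:
    """Storage mode of each of the M word lines in one hybrid Block."""
    return [wl_mode(i) for i in range(m)]


if __name__ == "__main__":
    print(hybrid_block_layout(8))
```

Because the modes alternate, no two MLC word lines are adjacent, which is the stated reason for interleaving: each MLC page has SLC neighbors, reducing MLC WL-WL coupling.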
94. The method of claim 64 wherein the first number of GBLs are top-level metal lines in parallel to the column direction characterized by a pitch size of 4 or 8 base units and a length of one Group size and the second number of LBLs are lower-level metal lines below the GBLs in parallel to the column direction characterized by a pitch size of 2 base units and a length of one Segment.
95. The method of claim 64 wherein each common power line per Segment and each common source line per Block are conductive lines made of metal or conductive polymer formed below the LBLs and perpendicular to the column direction.
96. The method of claim 64 wherein selectively discharging-and-retaining the precharged inhibit voltage comprises controlling one or more rows of Segment-select transistors to open connection between the LBLs and corresponding GBLs and also controlling one or more rows of P/D transistors to close connection of the LBLs to the common power line to partially discharge the Vinh from some LBLs to Vss and partially retain the precharged Vinh for the remaining LBLs to convert ½ or ¼ page data with a Vdd/Vss pattern stored in the data buffer to a corresponding ½ or ¼ page data with a Vinh/Vss pattern latched in the corresponding LBLs of the selected Segment for m-page all-BL concurrent and pipeline operation.
97. A method for operating a HiNAND2 memory chip with two-level bit line (BL) hierarchy array structure during multi-page concurrent MLC (LSB) Program and Program-Verify operation, the method comprising: providing a HiNAND2 memory chip in connection with a host and a flash controller via I/Os circuit, the HiNAND2 memory chip comprising: a plane of NAND cells formed on a common Triple-Pwell (TPW) region over a deep-Nwell region on a P-substrate, the plane comprising J Groups arranged in column direction, each Group being associated with N global bit lines (GBLs) arranged in the column direction as top-level metal lines and being separated from adjacent Groups by a row of Group-divided devices, each of the J Groups comprising L Segments arranged in the column direction, each of the L Segments being associated with 2N local bit lines (LBLs) disposed as lower-level metal lines in parallel to the GBLs and being separated from adjacent Segments by a row of Segment-divided devices, the N GBLs being coupled to N odd-numbered LBLs and N even-numbered LBLs respectively by a first row of N Segment-select transistors gated by a SEGo signal and a second row of N Segment-select transistors gated by a SEGe signal, the N odd/even-numbered LBLs being coupled to a common power line respectively through a first/second row of N P/D transistors commonly gated by a PREo/PREe signal, each Segment being paired with a neighboring Segment by a row of Segment-divided devices commonly gated by a TIE signal, each Segment comprising K Blocks wherein each Block comprises the 2N Strings one-to-one parallelly coupled to the 2N LBLs and each String comprising M cells connected in series and capped by a top String-select transistor and a bottom String-select transistor, the top String-select transistor connecting its drain to one corresponding LBL, the bottom String-select transistor connecting its source to a common source line disposed in parallel to but not connected to the common power line, the 2N Strings in a Block forming M pages of cells in row direction respectively gated by M word lines (WLs) and all corresponding top String-select transistors are gated by a SSL line and all corresponding bottom String-select transistors are gated by a GSL line, J, L, K, and N being integers chosen based on NAND cell density; a data buffer configured to store N-bits of partial page SLC or MLC data received from external I/Os which are further transferred and stored in selected 2N LBLs locally precharged with an inhibit voltage Vinh from the corresponding common power line in one or more sequential cycles through the N GBLs; a Segment-decoder configured with two latches to respectively control the SEGo signal and the SEGe signal for controlling connection between each GBL and one odd-numbered LBL and one even-numbered LBL; a Block-decoder configured with one latch to connect or disconnect a set of M XTs, 1 SSLp, and 1 GSLp bus lines shared to m sets of M WLs, 1 SSL line, and 1 GSL line on one-set-per-Block basis, m being an integer equal to 1 or greater; precharging, per one selected page in one Segment, four sets of LBLs associated with two paired Segments with the inhibit voltage Vinh up to 10V simultaneously in one cycle from one common power line per Segment, each set of LBLs being associated with a pseudo CACHE; loading and latching MLC (LSB) page data in pipeline manner in two cycles per page up to m pages from the data buffer by utilizing four (two paired) respective pseudo CACHEs with precharged Vinh, m being an integer larger than 1, each of the m pages being selected on one-page-per-Block basis from m Blocks selected on one-Block-per-Segment basis from m Segments in one or more Groups in the plane; setting and latching corresponding voltages to m sets of M WLs, 1 SSL line, and 1 GSL line for the m Blocks each with one selected page; selectively discharging-and-retaining the precharged inhibit voltage Vinh in the 2N LBLs on page-by-page basis for the m pages simultaneously in one cycle; performing m-page Program concurrently for increasing each cell logic state from one initial erase state to two logic states including an erase E-state with a threshold level Vt<0V and an initial program B′-state with a predetermined minimum Vt value set to be no smaller than a maximum Vt value set for a final lowest program logic state A-state.
98. The method of claim 97 wherein performing m-page Program further comprises adjusting the initial program B′-state comprising a Vt distribution divided into two sections of B1′-state and B2′-state by a Vtbmin value; and converting the B1′-state and B2′-state to generate a final second program logic state B-state and a final third program logic state C-state.
99. The method of claim 98 further comprising operations per page before loading: discharging the Vinh in a first paired CACHE, a CACHEcel with the selected page being set with a WL voltage at a minimum Vt value Vtamin set for the A-state and a CACHEint paired with the CACHEcel, to store and latch updated data therein; and discharging the Vinh in the CACHEcel with the selected page being set with a WL voltage at a minimum Vt value set for a final second program logic state B-state while retaining and latching data in the CACHEint.
100. The method of claim 99 further comprising: transferring last data from the CACHEint per bit to two first storage nodes of a Program/Read Buffer in the data buffer; transferring MLC MSB page data previously stored in one of a second paired CACHE, CACHEmsb, per bit to a Sense Amplifier in the data buffer through a Multiplier for amplification and then writing back to the CACHEmsb in the same data polarity, and further transferring into one second storage node of the Program/Read Buffer; and verifying currently updated data from the CACHEcel in the Sense Amplifier per bit using the Vtamin as a verify voltage against data in both the first storage nodes and a second storage node of the Program/Read Buffer per bit based on Vt distribution of the A-state.
101. The method of claim 100 further comprising: transferring MLC LSB page data previously stored in another one of the second paired CACHE, CACHElsb, per bit to the Sense Amplifier through a Multiplier for amplification and then writing back to the CACHElsb in the same data polarity and transferring into one second storage node of the Program/Read Buffer per bit; and verifying currently updated data from the CACHEcel per bit in the Sense Amplifier using a predetermined Vtbmin verify voltage against data in the first storage nodes transferred from the CACHEint and data in the second storage node of the Program/Read Buffer per bit based on Vt distribution of the B-state.
102. The method of claim 101 further comprising: verifying currently updated data from the CACHEcel per bit in the Sense Amplifier using a predetermined Vtcmin verify voltage against only data in the first storage nodes transferred from the CACHEint based on Vt distribution of the C-state; updating verified data in the CACHEcel and the CACHEint; and continuously performing next iterative All-BL MLC LSB Program operation based on the updated data in the CACHEcel until the MLC Program-Verify operation is passed.
103. The method of claim 97 wherein each Group-divided device, each Segment-selected device, each Segment-divided device, each P/D transistor, each top String-select device, and each bottom String-select device is a same type NMOS 1-poly medium-high-voltage (MHV) transistor.
104. The method of claim 97 wherein M is selected from 8, 16, 32, 64, 128 or other integer numbers depending on NAND design density.
105. The method of claim 97 wherein N-bit is assigned to be 4 KB with 2N-bit 8 KB being set as a page size so that the data buffer is scaled down by half.
106. The method of claim 97 wherein the data buffer comprises three circuits with same bit length, including a Multiplier circuit per bit for a first amplification of a small analog cell signal to a multiplied analog signal, a latch-type Sense Amplifier (SA) circuit per bit for a second analog amplification of the multiplied analog signal and conversion to a full digital signal, and a Program/Read buffer (P/RB) circuit per bit for temporarily storing 1-bit data.
107. The method of claim 106 wherein the P/RB circuit is configured to set “0” bit data to pass Vss voltage to channel of a program cell through each corresponding GBL and LBL and to set “1” bit data to pass Vdd voltage to channel of a program-inhibit cell through each corresponding GBL and LBL.
108. The method of claim 106 wherein the P/RB circuit comprises a pair of latch nodes as a first pair of storage nodes and a pair of gated capacitors as a second pair of storage nodes so that extra temporary storage bits are created to allow more flexible m-page all-BL concurrent MLC MSB and LSB Program-Verify and bit-flipping logic operations.
109. The method of claim 106 wherein the Multiplier circuit comprises an input port receiving a first analog voltage coupled to every drain node of N+1 first transistors and an output port outputting a second analog voltage from a drain node of a first one of the N+1 first transistors, the Multiplier circuit further comprises N capacitors being respectively inserted between two drain nodes of two adjacent first transistors, the Multiplier circuit further comprises N second transistors being respectively coupled between drain nodes of the last N first transistors and ground, thereby outputting the second analog voltage equal to N-fold of the first analog voltage, where N is an integer ≥ 1.
110. The method of claim 97 wherein the Segment-decoder comprises a three-input pre-decoder and n number of latch circuits and a local HV pump circuit with a SEGpo input and a SEGpe input, a VHH input, and corresponding n outputs of the SEGo signal and the SEGe signal respectively for selecting odd-numbered section of a Segment and even-numbered section of the Segment, and further comprises a plurality of control signals to respectively set and clear the n latch circuits and enable and disable the local HV pump circuit to determine a HXS node voltage for controlling voltage charging, latching, and discharging between the two SEGpo, SEGpe inputs and corresponding two outputs connected to the two SEGo, SEGe signals per Segment for performing multi-page all-BL Program, Program-Verify, Erase-Verify, and Read operation.
111. The method of claim 110 wherein the Segment-decoder comprises a function to instantly set the HXS node to Vss by setting one ESB signal of the plurality of control signals with a one-shot pulse of Vdd for a preset duration when an unintentional Vdd power loss is detected, allowing the two SEG signals to be set to Vss so that the inhibit voltage Vinh precharged to the LBLs can be immediately saved after unexpected power-down and can be reused to continue the operations after power is restored within a certain idle time.
112. The method of claim 97 wherein the Block-decoder comprises a latch circuit coupled with one pre-decoder with three address inputs and a local HV pump circuit with a set of M+2 inputs of M XTs, 1 GSLp, and 1 SSLp, a VHH input, and corresponding M+2 outputs coupled to one set of M WLs, SSL, and GSL lines per Block, and further comprises a plurality of control signals to set and clear the latch circuit and enable and disable the local HV pump circuit to determine a HXD node voltage for controlling voltage charging, latching, and discharging between the M+2 inputs of M XTs, 1 GSLp, and 1 SSLp and M+2 outputs connected to a set of M WLs, SSL, and GSL lines of a selected Block for m-page all-BL Program, Program-Verify, Erase-Verify, and Read operation.
113. The method of claim 112 wherein the HXD node is controlled to be a program voltage Vpgm ramped to 15-25V plus a cell threshold Vt margin during a SLC or MLC Program operation, or to be a read voltage of Vread of 6-8V plus a Vt margin during a SLC or MLC Read operation or a SLC or MLC Program-Verify operation or a SLC or MLC random-page Erase-Verify operation, or to be Vss for latching voltages for a set of M WLs, SSL, and GSL lines of a selected Block.
114. The method of claim 112 wherein the Block-decoder further comprises a function to immediately set the HXD node to Vss by setting an ENB signal of the plurality of control signals with a one-shot pulse of Vdd for a preset duration when an unintentional Vdd power loss is detected, allowing the voltages of one set of M WLs, SSL, and GSL lines of the selected Block to be locked to continue the last operation.
115. The method of claim 97 wherein the inhibit voltage Vinh comprises a variable value ranging from Vdd for the selected Segment in a selected Group nearest to the data buffer to 10V for the selected Segment in a selected Group farthest from the data buffer to save LBL precharge power consumption, wherein the Vinh value is one between Vdd and 10V for the selected Segment in a selected Group between the nearest one and the farthest one depending on how many Groups are connected from the selected Group to the data buffer.
116. The method of claim 97 wherein the plane comprises at least one hybrid Block having interleaved mixed pages for respectively storing SLC and MLC data for reducing MLC WL-WL coupling effect.
117. The method of claim 116 wherein the hybrid Block is configured to place all SLC WLs in odd-numbered WLs thereof and all MLC WLs in even-numbered WLs thereof.
118. The method of claim 97 wherein the N GBLs are top-level metal lines in parallel to the column direction characterized by a pitch size of 4 or 8 base units and a length of one Group size and the 2N LBLs are lower-level metal lines below the GBLs in parallel to the column direction characterized by a pitch size of 2 base units and a length of one Segment.
119. The method of claim 97 wherein each common power line per Segment and each common source line per Block are conductive lines made of metal or conductive polymer formed below the LBLs and perpendicular to the column direction.
120. The method of claim 97 wherein each NAND cell in the plane comprises a high-voltage 2-poly floating-gate NMOS transistor, or a 1-poly charge-trapping MONOS or SONOS transistor, or a 2-poly floating-gate or 1-poly charge-trapping Nitride layer 3D transistor, or a Vertical-gate or Vertical-channel 3D NAND transistor.
121. A method for operating a HiNAND2 memory chip with two-level bit line (BL) hierarchy array structure during multi-page concurrent SLC and MLC Read operation, the method comprising: providing a HiNAND2 memory chip in connection with a host and a flash controller via I/Os circuit, the HiNAND2 memory chip comprising: a plane of NAND cells formed on a common Triple-Pwell (TPW) region over a deep-Nwell region on a P-substrate, the plane comprising J Groups arranged in column direction, each Group being associated with N global bit lines (GBLs) arranged in the column direction as top-level metal lines and being separated from adjacent Groups by a row of Group-divided devices, each of the J Groups comprising L Segments arranged in the column direction, each of the L Segments being associated with 2N local bit lines (LBLs) disposed as lower-level metal lines in parallel to the GBLs and being separated from adjacent Segments by a row of Segment-divided devices, the N GBLs being coupled to N odd-numbered LBLs and N even-numbered LBLs respectively by a first row of N Segment-select transistors gated by a SEGo signal and a second row of N Segment-select transistors gated by a SEGe signal, the N odd/even-numbered LBLs being coupled to a common power line respectively through a first/second row of N P/D transistors commonly gated by a PREo/PREe signal, each Segment being paired with a neighboring Segment by a row of Segment-divided devices commonly gated by a TIE signal, each Segment comprising K Blocks wherein each Block comprises the 2N Strings one-to-one parallelly coupled to the 2N LBLs and each String comprising M cells connected in series and capped by a top String-select transistor and a bottom String-select transistor, the top String-select transistor connecting its drain to one corresponding LBL, the bottom String-select transistor connecting its source to a common source line disposed in parallel to but not connected to the common power line, the 2N Strings in a Block forming M pages of cells in row direction respectively gated by M word lines (WLs) and all corresponding top String-select transistors are gated by a SSL line and all corresponding bottom String-select transistors are gated by a GSL line, J, L, K, and N being integers chosen based on NAND cell density; a data buffer configured to store N-bits of partial page SLC or MLC data received from external I/Os which are further transferred and stored in selected 2N LBLs locally precharged with an inhibit voltage Vinh from the corresponding common power line in one or more sequential cycles through the N GBLs; a Segment-decoder configured with two latches to respectively control the SEGo signal and the SEGe signal for controlling connection between each GBL and one odd-numbered LBL and one even-numbered LBL; a Block-decoder configured with one latch to connect or disconnect a set of M XTs, 1 SSLp, and 1 GSLp bus lines shared to m sets of M WLs, 1 SSL line, and 1 GSL line on one-set-per-Block basis, m being an integer equal to 1 or greater; precharging the 2N LBLs associated with a selected page per Segment up to m pages with the inhibit voltage Vinh up to 10V simultaneously in one cycle from one common power line per Segment, the inhibit voltage Vinh being stored and latched in 2N metal parasitic capacitors formed by the 2N LBLs serving as a pseudo CACHE; discharging the Vinh charge of the pseudo CACHE in the selected Segment selectively with a predetermined VR1 read voltage being applied to each selected page; performing charge-sharing between the 2N LBLs per Segment and the data buffer via N GBLs in two cycles while identifying cells with an initial program state with Vt>0 in each page.
122. The method of claim 121 wherein the initial program state is a transient B′-state for reading MLC (MSB) data utilizing one pseudo CACHE for storing and latching temporary data for using the predetermined VR1 read voltage to distinguish the B′-state from erase E-state per cell on half-page basis in two cycles, wherein the B′-state is characterized by a predetermined minimum Vt value set to be no smaller than a maximum Vt value set for a final lowest program logic state A-state for MLC (LSB) data, with partially overlapping time intervals in a concurrent/pipeline manner regardless of m random pages or m non-random pages with a condition that each page includes a Flag cell assigned to 1 to indicate each addressed MLC-WL storing 2-Vt of MSB page data.
123. The method of claim 122 further comprising: utilizing three pseudo CACHEs respectively associated with three Segments to precharge, latch, and discharge temporary read data for reading m pages of MLC (LSB page) data and using the predetermined VR1 read voltage, a predetermined VR2 read voltage, and a predetermined VR3 read voltage to distinguish four logic states per cell including an erase E-state, a first program A-state, a second program B-state, and a third program C-state, wherein the second program B-state and the third program C-state are generated by adjusting the initial program state B′-state through one or more bit-flipping operations, with a condition that each page includes a Flag cell assigned to 0 to indicate each addressed MLC-WL storing 4-Vt of both MSB and LSB page data.
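The use of three read voltages to distinguish the four logic states can be sketched as a threshold comparison. This is an illustration only: it models a cell by an idealized threshold voltage Vt, whereas the claimed method senses through LBL discharge and charge-sharing; the function name `classify_state` and the numeric read voltages in the example are hypothetical, with the only requirement being VR1 < VR2 < VR3.

```python
def classify_state(vt: float, vr1: float, vr2: float, vr3: float) -> str:
    """Classify an MLC cell into E, A, B, or C from its threshold voltage.

    VR1 separates the E-state from A, B, and C;
    VR2 separates E and A from B and C;
    VR3 separates E, A, and B from C.
    """
    if not vr1 < vr2 < vr3:
        raise ValueError("read voltages must satisfy VR1 < VR2 < VR3")
    if vt < vr1:
        return "E"
    if vt < vr2:
        return "A"
    if vt < vr3:
        return "B"
    return "C"


if __name__ == "__main__":
    # Hypothetical read voltages in volts, chosen only for illustration.
    VR1, VR2, VR3 = 0.0, 1.5, 3.0
    for vt in (-1.0, 0.5, 2.0, 4.0):
        print(f"Vt = {vt:+.1f} V -> {classify_state(vt, VR1, VR2, VR3)}-state")
```

The mapping from the four states to MSB and LSB bit values is not fixed by this sketch; it depends on the bit-flipping steps applied in the Program/Read Buffer.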
124. The method of claim 123 further comprising: utilizing a first CACHEcel and a second CACHEint respectively associated with two Segments tied as a pair by a row of Segment-divided devices commonly gated by the TIE signal to simultaneously store and latch temporary read MLC (LSB) page data via the VR1 read voltage for distinguishing the E-state from A-state, B-state, and C-state; setting the TIE signal to Vss to isolate the second CACHEint from the first CACHEcel; utilizing the first CACHEcel for storing and latching temporary read page data via the VR2 read voltage for distinguishing the E-state and A-state from the B-state and C-state, the page data in the first CACHEcel being latched as MSB page data; transferring the MSB page data to a Sense Amplifier of the data buffer per bit and writing page data having the same polarities back to, and latching in, a third CACHEmsb associated with a third Segment not paired with the two Segments; utilizing the first CACHEcel for storing and latching temporary read page data via the VR3 read voltage for distinguishing the E-state, A-state, and B-state from the C-state; restoring the page data in the second CACHEint to two first storage nodes of a Program/Read Buffer in the data buffer per bit; restoring the page data in the third CACHEmsb to the Sense Amplifier per bit and transferring to two second storage nodes of the Program/Read Buffer per bit; restoring the page data in the first CACHEcel to the Sense Amplifier per bit; flipping bit polarity of each B-state bit per cell in the Program/Read Buffer; and reading out data from the P/RB per bit to obtain MLC LSB page data.